Understand Nukipa

Architecture & core concepts

Nukipa is an operated marketing platform that sits behind your website: a CMS, a content engine, lead capture, a lightweight CRM, newsletters, nurturing, and analytics. This article explains how those pieces fit together — the tenant model, what Nukipa runs versus what stays in your repo, the three ways you interact with it (dashboard, API, agent), and the public API your site talks to.

If you just want to ship content, you don't need any of this. Read it when you're deciding where Nukipa ends and your codebase begins, or when something doesn't behave the way you expected and you need a mental model to debug against.

[!NOTE] This is the orientation map, not the full surface. The platform has grown past the six services described in detail below — there are also newsletters, nurturing, social, audits, ingestion, deployer, and an assistant service. They get their own docs. Here we cover the content/lead core and the boundary between Nukipa and your site, and flag the rest where it touches that boundary.

The tenant (a.k.a. workspace)

A tenant is the top-level container for one customer's marketing: its content, leads, brand context, analytics, and members. In the dashboard it's labelled a "workspace"; in the API and database it's a tenant. Same thing — the gateway's tenants service mirrors the table's id to a tenant_id field in its responses (in withTenantId()) so both names resolve to one row, and the SPA reads it under tenant_id.

Every piece of data in the system is scoped to exactly one tenant. Tables carry a tenant_id that references public.tenants(id) with on delete cascade, so deleting a tenant cascades to everything under it. There is no cross-tenant data — a post, a lead, a CTA, a visit all belong to one tenant and are invisible to every other.

A tenant has:

Field	Notes
`id`	UUID, the canonical tenant identifier
`slug`	URL-safe, unique. Either supplied verbatim, or auto-derived from the name. Used to build the platform subdomain `<slug>.<platform-host>`.
`name`	Display name
`settings`	JSONB bag for tenant-level config (e.g. `image_provider`)

Slug allocation has two paths. If you supply a slug, it's honoured verbatim and a collision returns 409 Conflict. If you don't, Nukipa derives one from the name and, when the unique constraint trips, retries with the suffixes -2 … -8. If all eight candidates (the bare slug plus those seven suffixed variants) are taken, it appends a random 5-char base36 suffix, and only if that also collides does it return 409 ("could not allocate a unique slug").
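
The two allocation paths can be sketched as a small function. This is an illustration under stated assumptions, not the tenants service's code: `slugify`, `allocateSlug`, the `isTaken` predicate (standing in for the unique constraint), the dash before the random suffix, and the "slug already in use" message are all hypothetical.

```typescript
// Hypothetical helpers: none of these names come from the Nukipa codebase.
type SlugResult =
  | { ok: true; slug: string }
  | { ok: false; status: 409; error: string };

function slugify(name: string): string {
  return name
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // collapse anything not URL-safe into a dash
    .replace(/^-+|-+$/g, '');    // trim leading and trailing dashes
}

// Random 5-char base36 string, left-padded so it is always 5 characters.
function randomSuffix(): string {
  return Math.floor(Math.random() * 36 ** 5).toString(36).padStart(5, '0');
}

function allocateSlug(
  input: { name: string; slug?: string },
  isTaken: (slug: string) => boolean, // stands in for the unique constraint
  random: () => string = randomSuffix,
): SlugResult {
  const conflict = (error: string): SlugResult => ({ ok: false, status: 409, error });

  // Path 1: a supplied slug is honoured verbatim; a collision is a 409.
  if (input.slug !== undefined) {
    return isTaken(input.slug)
      ? conflict('slug already in use') // message text is an assumption
      : { ok: true, slug: input.slug };
  }

  // Path 2: derive from the name, then try the suffixes -2 … -8.
  const base = slugify(input.name);
  const candidates = [base];
  for (let n = 2; n <= 8; n++) candidates.push(`${base}-${n}`);
  // Last resort: one attempt with a random suffix.
  candidates.push(`${base}-${random()}`);

  for (const candidate of candidates) {
    if (!isTaken(candidate)) return { ok: true, slug: candidate };
  }
  return conflict('could not allocate a unique slug');
}
```

Passing the random generator in as a parameter keeps the exhaustion path deterministic under test.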

Members join a tenant through tenant_members with one of four roles:

Role	Can
`owner`	everything, including managing other owners
`admin`	manage members and all content
`member`	create and edit content
`viewer`	read-only

One guard worth knowing: you can't remove the last owner of a tenant. RLS doesn't enforce that — the tenants service checks it explicitly (counts the owners) before deleting a membership and returns 403 if you'd orphan the workspace.
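
A minimal sketch of that guard, assuming an in-memory membership list. Only the 403 comes from the behaviour described above; the `removeMember` name and the 204 and 404 statuses are assumptions made for the example.

```typescript
type Role = 'owner' | 'admin' | 'member' | 'viewer';

interface Membership {
  userId: string;
  role: Role;
}

// Returns the status a delete would produce plus the resulting member list.
// 403 mirrors the documented guard; 204 and 404 are assumed for illustration.
function removeMember(
  members: Membership[],
  userId: string,
): { status: number; members: Membership[] } {
  const target = members.find((m) => m.userId === userId);
  if (!target) return { status: 404, members };

  if (target.role === 'owner') {
    // Count the owners explicitly; removing the only one would orphan the workspace.
    const ownerCount = members.filter((m) => m.role === 'owner').length;
    if (ownerCount <= 1) return { status: 403, members };
  }

  return { status: 204, members: members.filter((m) => m.userId !== userId) };
}
```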

[!NOTE] Tenants and invitations are the one corner of the system where access is enforced by Postgres Row-Level Security against your own user token. Everywhere else, the gateway has already authorized you and internal services run with elevated DB access scoped manually by tenant_id (more on that below).

What Nukipa runs vs what stays in your repo

Nukipa owns the content, leads, context, and analytics. Your site — the pages, layout, routing, styling — is a separate thing. How that site is built and hosted is where you have a choice; the content underneath is the same regardless.

There are three ways to run the site:

Self-host. You build and host the site yourself (Next.js on Vercel, a static export, your own server). It reads content and ships signals back to Nukipa over the public API using @nukipa/site-sdk. You own the deploy.
Nukipa-managed site. Nukipa scaffolds a Next.js 15 site from a starter template (templates/site-nextjs/), keeps it in a repo, and hosts it on Nukipa-controlled infra via the deployer service (GitHub-connect → deploy → Vercel/managed infra). Per-tenant sites live under sites/.
Built-in blog. If you don't want a custom site at all, Nukipa runs a built-in blog app (apps/public, Nuxt SSR) that serves your posts directly off the same public API at <slug>.<platform-host> or a custom domain.

[!NOTE] Site-mode (composable multi-page sites — pages, navigation, events) is partly live, not finished. The public API and SDK already expose page, navigation, event, newsletter, and audit reads (see the surface table below), and the managed Next.js starter consumes them; but it's newer and thinner than the blog path. If you need a full marketing site today, the managed Next.js template is the route — treat the surfaces beyond posts/forms as still settling.

Inside Nukipa, several internal services each own one domain. The six covered here:

Service	Owns
cms	blog posts (markdown with embedded markers), draft/publish history, components, sources, facts, assets, lead-capture forms, form submissions, CTA clicks
crm	contacts (lead → mql → sql → customer → disqualified lifecycle), companies, deals, a per-contact activity timeline, AI lead classification
context	a path-addressed "wiki" of brand context — company profile, industry, products, ICP, USP, writing style — plus campaigns and the CTA registry. This is the grounding the content engine and lead classifier read from.
signals	analytics and visibility — page visits, search queries, keyword tracking, competitor scrapes, ChatGPT-visibility prompt tests, external news/trends
llm	the only place with model API keys; routes to OpenAI or Anthropic and logs every call for cost and audit
jobs	tracks long-running work (content generation, classification, refreshes) and reports progress back to the dashboard

The content engine is not a separate product — it's a set of background workers living inside the cms and signals services. When you (or an agent) ask for a blog post, the cms enqueues a cms.generate-blog-post job; a worker resolves your brand context from the context wiki, runs a writer model with live web search, extracts citations/components/facts, and saves a draft. It does not auto-publish — a human reviews and clicks Publish. Batch generation (a week or two of posts) and fact verification work the same way: enqueue, work in the background, report progress.

The Nukipa / your-repo boundary

The cleanest way to think about what lives where:

Lives in Nukipa	Lives in your repo
Post bodies, versions, components, sources, facts	Your site's pages, layout, routing, styling
Lead-capture form schemas + submissions	The form's rendered markup and UX
Contacts, companies, deals, activities	—
Brand context wiki, campaigns, CTA registry	—
Images/PDFs uploaded as assets (Supabase Storage)	Your own static assets
Analytics: visits, CTA clicks, search queries	The tracking call you fire on page view
Newsletters, sequences, sends	—

[!NOTE] For the managed site, the "your repo" column still applies — it's just a repo Nukipa scaffolds and deploys for you. The boundary is the same; only who pushes to the repo differs.

Your site fetches content at request time and renders it. Nukipa ships two things to make that painless:

@nukipa/site-sdk — a typed client for the public API. You give it a gateway URL and a way to resolve the request host. Core methods: getTenant, listPosts, getPostBySlug, listRelatedPosts, listFolders, getFormBySlug, submitForm, recordVisit, recordCtaClick. The client also covers the (newer) headless-CMS and engagement surfaces — getTenantSeo, submitContactForm, listPages, getPageBySlug, listEvents, getNav, plus newsletter (subscribeNewsletter, confirmNewsletter, …) and audit (runAudit, getAuditRun, …) method sets.
A post renderer — a <PostBody> component that renders the post's markdown-with-markers body, resolving embedded components (callouts, FAQs, charts, CTAs, lead forms, etc.) into real elements. It ships in two flavours: @nukipa/post-renderer-react for Next.js / React sites (sites/, templates/site-nextjs/), and @nukipa/post-renderer-vue for the built-in Nuxt blog (apps/public). Pick the one matching your stack.

A minimal blog index in a Next.js app, using the SDK and the React renderer:

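import { createNukipaClient } from '@nukipa/site-sdk';
import { headers } from 'next/headers';

const client = createNukipaClient({
  gatewayUrl: process.env.NUKIPA_GATEWAY_URL!,
  // Resolve the tenant host: explicit env override first, then the proxy's
  // forwarded host, then the raw Host header.
  getHost: async () => {
    if (process.env.NUKIPA_TENANT_HOST) return process.env.NUKIPA_TENANT_HOST;
    const h = await headers(); // read the request headers once
    return h.get('x-forwarded-host') || h.get('host') || '';
  },
});

// The calls below belong inside a server component or route handler,
// because headers() needs a request scope.
const tenant = await client.getTenant();
const posts  = await client.listPosts({ limit: 10 });
await client.recordVisit({ path: '/blog' });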

The post body format is markdown with three kinds of marker — the same string agents emit, the editor round-trips, and your site renders:

# A blog post

Some text {{cite:1}}.

{{component:9b8c…}}

{{fact}}The market grew 30% in 2024{{/fact}}{{cite:2}}

{{component:UUID}} resolves to a stored component, {{cite:N}} to the N-th source (1-based, by idx), and {{fact}}…{{/fact}} marks a verifiable claim. You don't parse this yourself (<PostBody> does), but it's worth knowing the body is one portable string, not a pile of HTML.
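
To make the format concrete, here is a rough tokenizer for the three marker kinds. It is not how <PostBody> works internally (that isn't shown here); the `Token` type, the `tokenize` function, and the regular expression are assumptions written for this illustration, and markers nested inside a fact are deliberately left unparsed.

```typescript
type Token =
  | { kind: 'text'; value: string }
  | { kind: 'component'; id: string }
  | { kind: 'cite'; index: number }
  | { kind: 'fact'; value: string };

function tokenize(body: string): Token[] {
  // One alternation per marker kind. The fact branch is non-greedy so two
  // facts in the same paragraph are not merged into one.
  const marker =
    /\{\{component:([^}]+)\}\}|\{\{cite:(\d+)\}\}|\{\{fact\}\}([\s\S]*?)\{\{\/fact\}\}/g;

  const tokens: Token[] = [];
  let last = 0;
  let match: RegExpExecArray | null;

  while ((match = marker.exec(body)) !== null) {
    // Emit any plain text between the previous marker and this one.
    if (match.index > last) {
      tokens.push({ kind: 'text', value: body.slice(last, match.index) });
    }
    if (match[1] !== undefined) {
      tokens.push({ kind: 'component', id: match[1] });
    } else if (match[2] !== undefined) {
      tokens.push({ kind: 'cite', index: Number(match[2]) }); // 1-based source index
    } else {
      tokens.push({ kind: 'fact', value: match[3] ?? '' });
    }
    last = match.index + match[0].length;
  }

  if (last < body.length) tokens.push({ kind: 'text', value: body.slice(last) });
  return tokens;
}
```

The point of the sketch is that everything a renderer needs is recoverable from a single string in one pass.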

[!TIP] Tenant resolution on the public API is by host. Reads are resolved from X-Forwarded-Host: a verified custom domain maps to your tenant, otherwise the platform subdomain <slug>.<platform-host> does. So in production you set NUKIPA_TENANT_HOST (or let your reverse proxy forward the real host) and the same code serves the right tenant's content. Without a host, recordVisit either drops the visit or persists the gateway's own host — so always supply it.
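
The lookup order described in the tip can be sketched like this. The `Directory` shape, `resolveTenant`, and the host normalisation (first forwarded value, port stripped, lowercased) are assumptions for the example; only the order (verified custom domain first, then <slug>.<platform-host>) comes from the text above.

```typescript
interface Tenant {
  id: string;
  slug: string;
}

// Hypothetical in-memory stand-ins for the verified-domain and tenant lookups.
interface Directory {
  verifiedDomains: Map<string, Tenant>; // custom domain -> tenant
  bySlug: Map<string, Tenant>;          // slug -> tenant
}

function normalizeHost(raw: string | undefined): string {
  // Take the first value if a proxy chained several, drop the port, lowercase.
  const first = (raw ?? '').split(',')[0] ?? '';
  return first.trim().toLowerCase().replace(/:\d+$/, '');
}

function resolveTenant(
  forwardedHost: string | undefined,
  platformHost: string,
  dir: Directory,
): Tenant | null {
  const host = normalizeHost(forwardedHost);
  if (!host) return null; // no host, no tenant

  // 1. A verified custom domain wins.
  const custom = dir.verifiedDomains.get(host);
  if (custom) return custom;

  // 2. Otherwise fall back to <slug>.<platform-host>.
  const suffix = `.${platformHost.toLowerCase()}`;
  if (host.endsWith(suffix)) {
    const slug = host.slice(0, -suffix.length);
    if (slug && !slug.includes('.')) return dir.bySlug.get(slug) ?? null;
  }
  return null;
}
```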

Dashboard vs API vs agent

There are three ways to drive the same engine. They all end up at the same internal services through the same gateway — they differ in who's holding the controls.

Surface	Who uses it	Auth	Talks to
Dashboard (`apps/ui`, Vue SPA)	humans	Supabase user JWT + `X-Tenant-Id`	gateway `/api/*`
API	your own backend / scripts	personal API key `nk_…` (or OAuth bearer)	gateway `/api/*` and `/public/v1/*`
Agent (MCP)	ChatGPT, Claude, other MCP clients	OAuth 2.1 bearer `nk_…`	gateway `/mcp`

The dashboard is the human cockpit: write and edit posts in a WYSIWYG editor, review and publish, manage leads, configure forms and CTAs, watch analytics. It signs in directly to Supabase Auth, picks a tenant, and calls the gateway with your JWT. There are no save buttons — the autoSave mixin debounces field edits (600 ms default) into PATCHes, and the SPA subscribes to Supabase Realtime so changes and background-job progress stream in without reloads. Realtime echoes are merged only into fields you aren't currently editing, so a co-editor's change doesn't clobber what you're typing.

The API is the same set of operations, reachable programmatically. Most server-to-server reads your site does (content, forms) go through the no-auth public surface; authenticated operations use an nk_… API key.

The agent path is MCP (Model Context Protocol). An MCP client — ChatGPT, Claude — registers via OAuth, you approve it against a specific tenant, and it gets a scoped nk_… bearer. From then on the agent can call the same operations as tools: cms_create_post, cms_generate_post, crm_search_contacts, cms_publish_post, and so on. The agent is not a fourth backend — it's a client of the same gateway, holding a token scoped to one tenant. OAuth issues an nk_… access token and an nkr_… refresh token; both live in public.api_keys alongside personal API keys.

The same operation is exposed all three ways on purpose. A human publishing in the dashboard, your nk_…-keyed script publishing, and an agent calling cms_publish_post all hit the identical code path.

[!NOTE] Whoever the caller is, the internal services can tell. The gateway stamps a role onto every internal request: owner/admin/member/viewer for humans, apikey for API/OAuth callers, and service:<name> for service-to-service calls. Services use this to suppress side effects that only make sense for humans — for example, a lead created by a form submission (service:cms) lands unowned, rather than being assigned to whoever happened to trigger it.

The gateway: the public front door

Almost everything goes through one front door. The gateway (apps/gateway) is the only component exposed to the internet. The internal services bind to 127.0.0.1 and never take a request directly — only the gateway can reach them.

The gateway owns every concern that has to happen before "real work":

CORS and per-tenant rate limiting
Auth — verifying the Supabase user JWT, or the nk_… bearer
Tenant membership — confirming you actually belong to the tenant you named in X-Tenant-Id
OAuth 2.1 — the registration/authorize/token dance for MCP clients
MCP — the /mcp endpoint agents call
Tenants & invitations — handled in-process (with your user token, so RLS applies)
The public surface — the no-auth read/write endpoints your site uses, plus a handful of inbound webhooks (Resend, Post for Me, ingestion sources) that the gateway verifies/forwards

For an authenticated app request, the flow is:

The browser (or your script) calls the gateway with Authorization: Bearer <jwt-or-nk-key> and X-Tenant-Id: <uuid>.
The gateway validates the token and confirms (user, tenant) membership.
For /api/<service>/*, it mints a short-lived internal token (a 60-second HMAC-signed JWT carrying { sub, tenant_id, role, email }) and proxies the request to the right internal service.
The internal service verifies that token and runs the query, always filtered by tenant_id.

That last point is the load-bearing invariant. Internal services use an elevated (service-role) database client — RLS is not their safety net. Their safety net is the rule that every single query scopes by req.tenantId. It's documented loudly in each service's tenantDb.js, and it's the thing to check first if you ever suspect cross-tenant bleed.
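
For a feel of what a 60-second HMAC-signed internal token involves, here is a sketch using Node's built-in crypto module. It assumes HS256 and a shared secret; the function names and the `iat`/`exp` claims are illustrative choices, not the gateway's actual implementation.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

interface InternalClaims {
  sub: string;
  tenant_id: string;
  role: string;
  email: string;
}
type SignedClaims = InternalClaims & { iat: number; exp: number };

const encode = (text: string): string => Buffer.from(text, 'utf8').toString('base64url');

function sign(data: string, secret: string): Buffer {
  return createHmac('sha256', secret).update(data).digest();
}

// Mint an HS256 JWT that expires ttlSeconds after nowSeconds (60 by default).
function mintInternalToken(
  claims: InternalClaims,
  secret: string,
  nowSeconds: number,
  ttlSeconds = 60,
): string {
  const header = encode(JSON.stringify({ alg: 'HS256', typ: 'JWT' }));
  const payload = encode(JSON.stringify({ ...claims, iat: nowSeconds, exp: nowSeconds + ttlSeconds }));
  const signature = sign(`${header}.${payload}`, secret).toString('base64url');
  return `${header}.${payload}.${signature}`;
}

// Returns the claims if the signature matches and the token is unexpired, else null.
function verifyInternalToken(token: string, secret: string, nowSeconds: number): SignedClaims | null {
  const parts = token.split('.');
  if (parts.length !== 3) return null;
  const [header, payload, signature] = parts as [string, string, string];

  const expected = sign(`${header}.${payload}`, secret);
  const actual = Buffer.from(signature, 'base64url');
  // Check length first: timingSafeEqual throws on mismatched lengths.
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) return null;

  // Only parse the payload once the signature has been verified.
  const claims = JSON.parse(Buffer.from(payload, 'base64url').toString('utf8')) as SignedClaims;
  return claims.exp > nowSeconds ? claims : null;
}
```

A service that verifies a token this way would take `tenant_id` from the verified claims rather than from anything the caller can set, which is what makes scoping every query by it meaningful.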

The public surface (`/public/v1/*`)

Your site never holds a user JWT or an API key. It talks to a no-auth, host-resolved, rate-limited subset of the gateway. The core reads and writes:

Route	Purpose	Limiter
`GET /public/v1/tenant`	tenant card for the host	read
`GET /public/v1/tenant/seo`	IndexNow ownership key (404 when not onboarded)	read
`GET /public/v1/posts`	published posts (from the published snapshot); `?folder=` narrows	read
`GET /public/v1/posts/:slug`	one post + components + sources	read
`GET /public/v1/posts/:slug/related`	related posts	read
`GET /public/v1/folders`	folders with ≥1 published post	read
`GET /public/v1/forms/:slug`	form schema (404 if missing/disabled)	read
`POST /public/v1/forms/:slug/submit`	lead capture → creates a CRM contact	submit
`POST /public/v1/posts/:postId/contact-form-submissions`	inline contact-form submission	submit
`POST /public/v1/cta-clicks`	CTA click ingest	visit
`POST /public/v1/signals/visits`	page-view ingest	visit
`POST /public/v1/signals/visits/confirm`	cookieless proof-of-JS beacon	visit

That's the subset a typical site uses. The gateway also fronts more, behind the same no-auth surface and (for the last block) feature flags: GET /public/v1/events and /events/:slug, GET /public/v1/data/records[/:id] (a Nukipa-managed site reading its own ingested records), GET /public/v1/skill/:name.zip (a Claude Code skill download), inbound webhooks (POST /public/v1/webhooks/resend, /webhooks/postforme, /ingestion/webhooks/:slug), and — when enabled — four audits/* routes and the newsletters/* subscribe/confirm/unsubscribe routes. Don't treat the core table as the whole surface; treat it as the part you'll touch first.

[!NOTE] The three rate-limit buckets behave differently. The read limiter is currently a no-op passthrough (anonymous public data is cheap and idempotent; real abuse protection is expected at the edge). The submit bucket is large and per-IP, sized so server-proxied cross-tenant traffic doesn't collapse onto one counter. The visit bucket caps at ~1200/min per IP. Cranking these is an env change (PUBLIC_SUBMIT_RATE_LIMIT), not a deploy.
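
The note doesn't say which algorithm the limiters use, so treat the following as one plausible shape rather than the gateway's code: a fixed-window counter keyed by IP, sized like the visit bucket, next to a passthrough read limiter. The class name, the 60-second window, and the idea that a denied request would get a 429 are all assumptions.

```typescript
// A fixed-window counter keyed by an arbitrary string (here, the client IP).
class FixedWindowLimiter {
  private readonly windows = new Map<string, { start: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed, false if it should be rejected
  // (a 429 would be the conventional response; that status is an assumption).
  allow(key: string, nowMs: number): boolean {
    const current = this.windows.get(key);
    if (!current || nowMs - current.start >= this.windowMs) {
      // First request from this key, or the previous window has elapsed.
      this.windows.set(key, { start: nowMs, count: 1 });
      return true;
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}

// The read bucket as described: a no-op passthrough.
const readLimiter = { allow: (_key: string, _nowMs: number): boolean => true };

// The visit bucket: roughly 1200 requests per minute per IP.
const visitLimiter = new FixedWindowLimiter(1200, 60000);
```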

Two properties matter here. Reads serve a published snapshot, not the live draft — when you publish, the cms copies the body/components/sources into an immutable version row (post_versions) and sets current_version_id; the public read serves that. Editing a draft doesn't change what's live until you publish again; unpublishing clears current_version_id, so public reads start 404'ing immediately. Tracking writes are best-effort at the gateway — the visit and CTA-click endpoints answer 204 even when the downstream call misfires, so a flaky analytics ping never blocks a page from rendering.
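
The snapshot behaviour is easy to model in memory. In this sketch the `PostStore` class and its method names are hypothetical; the `versions` map stands in for post_versions and `currentVersionId` for current_version_id.

```typescript
interface Snapshot {
  id: number;
  body: string;
}

interface Post {
  slug: string;
  draftBody: string;
  currentVersionId: number | null;
}

class PostStore {
  private readonly versions = new Map<number, Snapshot>(); // stands in for post_versions
  private nextId = 1;
  readonly post: Post;

  constructor(post: Post) {
    this.post = post;
  }

  // Editing only touches the draft; whatever is live stays as it was.
  editDraft(body: string): void {
    this.post.draftBody = body;
  }

  // Publishing copies the draft into a frozen version row and re-points the post at it.
  publish(): void {
    const snapshot: Snapshot = Object.freeze({ id: this.nextId++, body: this.post.draftBody });
    this.versions.set(snapshot.id, snapshot);
    this.post.currentVersionId = snapshot.id;
  }

  // Unpublishing clears the pointer, so public reads start returning 404.
  unpublish(): void {
    this.post.currentVersionId = null;
  }

  publicRead(): { status: 200; body: string } | { status: 404 } {
    const id = this.post.currentVersionId;
    const snapshot = id === null ? undefined : this.versions.get(id);
    return snapshot ? { status: 200, body: snapshot.body } : { status: 404 };
  }
}
```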

[!WARNING] "Best-effort" is a gateway contract, not an SDK return value. At the SDK layer the two calls behave differently: recordVisit returns the result only on a 201 and null otherwise (including on any thrown error), while recordCtaClick returns void and swallows errors entirely. So a dropped visit is observable in your code as null, whereas a dropped CTA click is not; neither ever surfaces to the visitor.

A form submission is the one public write that reaches across services, and it's the seam where "your site" becomes "a lead in the CRM":

visitor submits form
  → POST /public/v1/forms/:slug/submit   (gateway, no auth, host-resolved)
  → cms resolves tenant by form slug
  → cms inserts cms.form_submissions (with ip_hash, never raw IP; status='received')
  → cms calls crm POST /contacts  (role=service:cms, classify=true)
  → crm inserts crm.contacts (stage=lead) and enqueues classification
  → crm returns { id }; cms back-fills crm_contact_id and sets status='crm_synced'
  → response passes back through to the visitor

The lead then gets classified in the background — a crm.classify-contact job pulls your ICP/industry/product/USP context from the context wiki and scores fit and intent — without the visitor's request waiting on any of it.

[!WARNING] The CRM hop can fail and the visitor still succeeds. The submission status defaults to crm_failed and only flips to crm_synced after the CRM returns an id. If POST /contacts errors, the cms logs it, leaves crm_contact_id null, keeps the row at status='crm_failed', and still returns success to the visitor. Nothing is lost — the lead is captured in cms.form_submissions — but it isn't a CRM contact until you reconcile it. The dashboard inbox surfaces these; you can override the status there. (The inline contact_form component returns this status in its response body, { id, status }; status is one of received / crm_synced / crm_failed / spam.)
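
The capture-first, sync-second ordering can be sketched as follows. This is a simplification, not the cms handler: the CRM call is synchronous here only to keep the example short, and `handleSubmit`, `CreateContact`, the id format, and the log wording are hypothetical.

```typescript
type SubmissionStatus = 'received' | 'crm_synced' | 'crm_failed' | 'spam';

interface Submission {
  id: string;
  status: SubmissionStatus;
  crm_contact_id: string | null;
}

// Hypothetical CRM client. The real hop is a service-to-service HTTP call.
type CreateContact = (lead: { email: string }) => { id: string };

function handleSubmit(
  lead: { email: string },
  createContact: CreateContact,
  log: (message: string) => void = console.error,
): { ok: true; submission: Submission } {
  // 1. Capture first, so the lead exists even if the CRM hop fails.
  const submission: Submission = {
    id: `sub_${Date.now()}`, // illustrative id only
    status: 'received',
    crm_contact_id: null,
  };

  // 2. Assume failure; only a returned contact id flips the status.
  let status: SubmissionStatus = 'crm_failed';
  try {
    const contact = createContact(lead);
    submission.crm_contact_id = contact.id;
    status = 'crm_synced';
  } catch (err) {
    log(`crm sync failed: ${err instanceof Error ? err.message : String(err)}`);
  }
  submission.status = status;

  // 3. The visitor gets success either way.
  return { ok: true, submission };
}
```

Defaulting to crm_failed means a crash between the two steps can never leave a row that claims to be synced without a contact id behind it.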

A note on the CRM lifecycle

CRM contacts carry two orthogonal fields, which is easy to conflate:

stage — lifecycle: lead → mql → sql → customer → disqualified. Where the contact is in the funnel.
status — disposition, with its own POST /contacts/:id/status route (and a disqualified_reason captured when set to unqualified). How you've dispositioned the contact, independent of stage.

Form-created leads land at stage='lead' and unowned (because the creator is service:cms). The classifier suggests a stage/status but doesn't move the contact itself — a human or agent applies the change.

FAQ

Is my website hosted by Nukipa? It can be, or not — your choice. Three options: self-host your own site against the public API; let Nukipa scaffold and host a Next.js site for you (the deployer service handles deploys on Nukipa-controlled infra); or use the built-in Nuxt blog (apps/public) if you don't want a custom site at all. In every case the content, leads, context, and analytics live in Nukipa.

What's the difference between a tenant and a workspace? Nothing — they're the same object. The dashboard calls it a workspace; the API and database call it a tenant. One customer's marketing = one tenant.

Which post renderer do I use? @nukipa/post-renderer-react for a Next.js/React site (including the managed templates/site-nextjs/ starter and anything under sites/). @nukipa/post-renderer-vue is what the built-in Nuxt blog (apps/public) uses. Same body format, same <PostBody> component name, different framework.

Can an agent do everything I can do in the dashboard? Largely yes — the agent calls the same gateway operations as MCP tools (cms_create_post, crm_search_contacts, etc.), scoped to the tenant you approved it against. The dashboard, an nk_…-keyed script, and an agent all drive the same internal services through the same front door.

Why doesn't my site just talk to the database directly? Because the gateway is where auth, tenant-membership checks, rate limiting, host-based tenant resolution, and cross-service composition (like form-submit → lead) live. The public API is intentionally small and read-mostly so it's safe to call from an anonymous visitor's browser.

I published an edit but the live site didn't change — why? Public reads serve the published snapshot, not your live draft. Editing changes the draft; the live post only updates when you publish again (which writes a new post_versions row and re-points current_version_id). Unpublishing takes the post offline immediately — it starts returning 404.

A form was submitted but the lead isn't in the CRM — where did it go? It's in cms.form_submissions with status='crm_failed'. The CRM call failed (or the CRM was unreachable) at submit time, but the submission itself is never dropped. Check the dashboard inbox for that post/form; you can re-classify or override the status there.

Where do brand voice and ICP come from? The context service — a path-addressed wiki of your company profile, industry, products, ICP, USP, and writing style. It's populated during onboarding (by crawling your company URL) and is the grounding both the content writer and the lead classifier read from. The CMS and CRM are consumers of it, not owners.
