kAIxU Gateway Delta — Security & Data Handling
This document explains, in procurement-ready detail, how kAIxU secures AI access across a 60+ app ecosystem:
a Netlify console (“the UI”), a governed gateway (“the Gate”), and a hardened control plane (“Gateway13”)
that issues keys, enforces policy, meters usage, and—when enabled—powers true semantic retrieval (RAG).
- Secure by default: gate closed unless authenticated
- Least privilege: master keys + scoped sub-keys
- Audit-ready: metering + exports + device seats
- RAG-ready: embeddings lane inside the gateway
1) System Overview: what’s public, what’s private, what’s governed
kAIxU is designed as infrastructure. The public-facing surface is the UI console. The execution plane (the “brain”)
sits behind the Gate. All apps, tools, and integrations speak to the same gateway contract—so security policy is enforced
once, centrally, instead of 60 different times in 60 different frontends.
Core design principle: apps never talk to model vendors directly, and users never hold vendor keys. Apps talk to
the kAIxU Gate with a kAIxU token. The Gate routes, enforces policy, meters usage, and streams responses.
The live Gate Delta contract published on the console includes:
GET /v1/health → reports keyConfigured, authConfigured, and whether the gate is open
GET /v1/models → returns allowed models (router visibility)
POST /v1/generate → non-stream response (policy enforced per token)
POST /v1/stream (SSE) → streaming responses (policy + metering apply)
(Gate Delta page).
Verified posture: the live health endpoint reports the gate is not open by default (openGate: false) and that authentication + key configuration are enabled (health endpoint).
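The contract above can be sketched from an app's point of view. This is a minimal illustration, not the live client: the base URL, token value, and Bearer-header shape are assumptions; the authoritative contract is the one published on the Gate Delta page.

```python
# Minimal sketch of the Gate Delta contract as an app sees it.
# BASE_URL and the Authorization header shape are illustrative assumptions.
from typing import Optional

BASE_URL = "https://gate.example.com"  # hypothetical

def build_request(endpoint: str, token: str, body: Optional[dict] = None) -> dict:
    """Assemble a gateway call: apps send a kAIxU token, never a vendor key."""
    method = "POST" if body is not None else "GET"
    return {
        "method": method,
        "url": f"{BASE_URL}{endpoint}",
        "headers": {"Authorization": f"Bearer {token}"},
        "json": body,
    }

# The four canonical endpoints:
health   = build_request("/v1/health", token="kaixu_sub_abc")
models   = build_request("/v1/models", token="kaixu_sub_abc")
generate = build_request("/v1/generate", token="kaixu_sub_abc",
                         body={"model": "fast-1", "prompt": "hello"})
stream   = build_request("/v1/stream", token="kaixu_sub_abc",
                         body={"model": "fast-1", "prompt": "hello", "stream": True})
```

The point of the sketch: every app, regardless of which of the four endpoints it calls, presents the same kAIxU token and hits the same contract, so policy is enforced in one place.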
2) “Security Boundary” clarified (what actually stops attackers)
Browser controls like CORS are not the primary security boundary. They reduce certain browser-based abuse, but
a motivated attacker can call an endpoint server-to-server and bypass browser policies entirely.
The real boundary in kAIxU is the kAIxU key system and the policy engine behind it:
authentication, scoped privileges, caps, rate limits, device binding, model/provider restrictions, revocation, and auditing.
Enterprise-safe framing: “Origins can be permissive for integration flexibility, because access is enforced at the token layer.
Without a valid kAIxU token, the gate hard-fails. With a token, policy constrains what’s allowed.”
Gateway13 is the control plane that provisions customers, issues master keys, creates scoped sub-keys, meters usage,
exports audit data, and manages embeddings/RAG (Gateway13 page).
- Keys: master key shown once + sub-keys with overrides
- Policy: caps, RPM/RPD, max devices, install_id requirement, provider/model allowlists
- Audit: events + summaries + CSV exports + monthly usage view
- RAG: embeddings lane (ingest → embed → store → search) under the same auth model
3) Identity, Access, and Tenant Controls
kAIxU uses a deliberate hierarchy: a customer master key and optional sub-keys. This structure enables least privilege:
you can authorize a whole company while still restricting each app, developer, environment, or workflow to only what it needs.
- Master key issuance (shown once): The control plane issues a customer key that is displayed once and must be stored securely. This reduces accidental exposure and forces correct operational behavior. (Gateway13 UI behavior is described on the admin page.)
- Sub-keys with scoped privileges: Sub-keys can override limits and scope—this prevents “one key unlocks everything.” If a sub-key leaks, its blast radius is limited by design.
- Device-seat control: Policies support a maximum device count and optional install binding (install_id). This is direct phishing resistance: a copied token is dramatically less useful when it can’t be replayed from unlimited new devices.
- Hard revoke + rotation: Revocation is immediate cutoff. Rotation is issued as a new key with the same settings, while the old key is revoked. This is what “incident response ready” looks like in practice.
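The master/sub-key hierarchy above can be illustrated as a policy-override model. This is a hedged sketch, not the control plane's actual schema: the field names (monthly_cap_usd, rpm, max_devices, require_install_id) and the clamping rule are assumptions chosen to show the least-privilege property, where a sub-key may narrow but never widen the master key's privileges.

```python
from dataclasses import dataclass

# Illustrative sketch of the master-key / sub-key hierarchy.
# Field names are assumptions, not the control plane's real schema.

@dataclass
class Policy:
    monthly_cap_usd: float
    rpm: int
    max_devices: int
    require_install_id: bool
    allowed_models: frozenset

@dataclass
class SubKey:
    key_id: str
    overrides: dict  # subset of Policy fields the sub-key tightens

def effective_policy(master: Policy, sub: SubKey) -> Policy:
    """A sub-key may only narrow privilege, never exceed the master key."""
    p = Policy(**{**master.__dict__, **sub.overrides})
    # Clamp so overrides can tighten but never widen the master policy.
    p.monthly_cap_usd = min(p.monthly_cap_usd, master.monthly_cap_usd)
    p.rpm = min(p.rpm, master.rpm)
    p.max_devices = min(p.max_devices, master.max_devices)
    p.require_install_id = p.require_install_id or master.require_install_id
    p.allowed_models = p.allowed_models & master.allowed_models
    return p

master = Policy(500.0, 120, 10, False, frozenset({"fast-1", "smart-2"}))
app_key = SubKey("sub_marketing", {"monthly_cap_usd": 50.0, "rpm": 30,
                                   "allowed_models": frozenset({"fast-1"})})
p = effective_policy(master, app_key)
```

If this sub-key leaks, the blast radius is the clamped policy ($50/month, 30 RPM, one model), not the whole tenant.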
Why this matters: Most “AI wrappers” die in production because they treat keys like static passwords. kAIxU treats keys like governed access objects with policy, audit, and containment.
4) Spend, Rate, and Abuse Controls (Blast Radius Management)
kAIxU includes guardrails that constrain cost and prevent runaway usage. These controls are not just for billing; they are security controls that
prevent one compromised integration from turning into unlimited spend or denial-of-service.
- Monthly caps: A customer can be assigned a monthly spend cap. Sub-keys can have smaller caps or override policies for specific apps.
- Rate limiting: Requests-per-minute (RPM) constraints can be applied per key. Optional per-day limits (RPD) provide additional containment.
- Model/provider restrictions: Keys can restrict access to specific providers and models. This becomes both compliance control (approved models only) and cost control (fast models only).
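The composition of rate limiting and spend caps can be sketched as a single admission check. The gateway's actual enforcement internals are not published; the class name, sliding-window RPM approach, and cost-estimate parameter below are all illustrative assumptions.

```python
from collections import deque

# Hedged sketch: per-key RPM (sliding 60s window) + monthly spend cap,
# composed into one allow/deny decision. Names are illustrative.

class RateGate:
    def __init__(self, rpm: int, monthly_cap_usd: float):
        self.rpm = rpm
        self.cap = monthly_cap_usd
        self.spent = 0.0
        self.window = deque()  # request timestamps within the last 60s

    def allow(self, est_cost_usd: float, now: float) -> bool:
        # Drop timestamps that have aged out of the 60-second window.
        while self.window and now - self.window[0] >= 60:
            self.window.popleft()
        if len(self.window) >= self.rpm:
            return False            # RPM exceeded -> throttle
        if self.spent + est_cost_usd > self.cap:
            return False            # monthly cap reached -> hard stop
        self.window.append(now)
        self.spent += est_cost_usd
        return True
```

Either control alone can fail open (a slow drip evades RPM; a burst evades a monthly cap check done lazily); together they bound both the rate and the total of abuse.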
Operational outcome: even if a token is abused, the system is designed to self-limit. The attacker hits caps, rate limits, or device bounds long before damage becomes existential.
5) Data Handling Defaults: transcripts vs knowledge, and what “memory” really means
“Memory” can mean two radically different things in AI systems:
- Transcript retention: storing raw conversational messages over time (high sensitivity).
- Knowledge retrieval (RAG): storing business documents (or embeddings derived from them) so the system can retrieve relevant facts when responding (configurable sensitivity).
The kAIxU architecture is designed so these can be separated. A client can run with minimal retention while still gaining “intelligent recall”
through document retrieval.
Client-friendly promise: by default, you can operate without saving raw chat transcripts to cloud storage. When a client wants long-term intelligence, they enable knowledge ingestion (RAG) and control retention/namespace.
The Gateway13 admin console includes an Embeddings Lane described as “True Semantic RAG inside the Gateway,” supporting
an ingest → embed → store → search flow under the same authenticated key model. It also describes an architecture that routes
embeddings inside the gateway and stores vectors in a pgvector-backed database (e.g., Neon), with cosine similarity search and metadata filtering.
(Gateway13 Embeddings Lane section).
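The query semantics of that flow (cosine similarity plus metadata filtering) can be shown with a toy in-memory store. In the described architecture the vectors live in a pgvector-backed database; this sketch only mirrors the semantics, and the function names and two-dimensional vectors are illustrative.

```python
import math

# Toy illustration of ingest -> embed -> store -> search. A real deployment
# stores vectors in pgvector; this mimics only the query semantics:
# cosine similarity ranked within the caller's tenant namespace.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

store = []  # rows of (vector, metadata) -- stand-in for the vector table

def ingest(vector, namespace, text):
    store.append((vector, {"namespace": namespace, "text": text}))

def search(query_vec, namespace, top_k=1):
    """Rank by cosine similarity, but only within the tenant's namespace."""
    rows = [(cosine(query_vec, v), meta) for v, meta in store
            if meta["namespace"] == namespace]
    return sorted(rows, key=lambda r: r[0], reverse=True)[:top_k]

ingest([1.0, 0.0], "tenant-a", "refund policy")
ingest([0.0, 1.0], "tenant-a", "shipping times")
ingest([1.0, 0.0], "tenant-b", "internal memo")

hits = search([0.9, 0.1], "tenant-a")
```

Note that tenant-b's document never appears in tenant-a's results even though its vector is the closest match overall: the namespace filter is applied before ranking, which is the containment property the threat model relies on.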
6) Streaming (SSE) and Reliability Verification
Streaming is not a cosmetic feature. In production, streaming reduces user abandonment, improves perceived latency, and enables longer outputs without “timeout theater.”
Many gateways break here. kAIxU treats SSE as first-class.
- Dedicated streaming endpoint: Gate Delta exposes an SSE stream endpoint as part of the canonical API surface.
- Smoke test verification: The console includes a “Smoke Test” that checks health, models, CORS preflight, non-stream generate, and SSE streaming.
This is explicitly described as verifying the upstream worker is online and that auth + key loading are correct.
(Smoke Test page).
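Client-side SSE handling can be sketched as follows. The wire format shown (data: lines separated by blank lines, a [DONE] sentinel) follows the common text/event-stream convention used for LLM streaming; the Gate's exact event schema may differ.

```python
# Sketch of parsing a /v1/stream response body. The "data:" framing and
# "[DONE]" sentinel are assumptions based on common SSE practice.

def parse_sse(raw: str):
    """Yield data payloads from a raw text/event-stream body."""
    for line in raw.splitlines():
        if line.startswith("data: "):
            payload = line[len("data: "):]
            if payload == "[DONE]":
                return
            yield payload

body = "data: Hel\n\ndata: lo\n\ndata: [DONE]\n\n"
chunks = list(parse_sse(body))
```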
Why buyers care: “We can prove the system is healthy right now” is stronger than “we promise it usually works.”
7) Auditing, Exports, and “Evidence-First” Operations
Secure systems are not just locked; they are measurable. Gateway13 is structured to provide operational evidence:
what happened, when, under which key, from which device seat, and what it cost.
- Monthly usage visibility: Usage can be viewed by month with event details and totals.
- Exports for accounting and investigation: Events CSV, Summary CSV, and Invoice CSV exports are exposed as standard tools.
- Device seat table: Install IDs and device attributes (first/last seen, UA) allow investigation of suspicious access patterns.
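The shape of an events export can be illustrated like this. The column names are assumptions based on the audit fields listed above (timestamp, key, device seat, model, cost); the actual Gateway13 CSV schema may differ.

```python
import csv
import io

# Illustrative events-CSV export. Column names are assumptions, not the
# actual Gateway13 schema.

events = [
    {"ts": "2025-01-02T10:00:00Z", "key_id": "sub_marketing",
     "install_id": "dev-7f3a", "model": "fast-1", "cost_usd": "0.0021"},
    {"ts": "2025-01-02T10:05:00Z", "key_id": "sub_marketing",
     "install_id": "dev-7f3a", "model": "fast-1", "cost_usd": "0.0030"},
]

def export_events_csv(rows):
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["ts", "key_id", "install_id",
                                             "model", "cost_usd"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

csv_text = export_events_csv(events)
```

Because every event carries a key_id and an install_id, the same export serves accounting (sum cost_usd per key) and investigation (group by install_id to spot an unfamiliar device seat).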
Procurement translation: “This isn’t a black box. We can show usage, enforce limits, prove access policy, and produce audit artifacts.”
8) Integration Safety: what developers do (and do not) have to do
kAIxU is designed to reduce integration mistakes. The gateway contract is standardized across tools and apps:
developers integrate once, using a kAIxU token, instead of wiring vendor SDKs and scattering secrets across multiple apps.
- No vendor API keys in the browser: Apps call the gate; the gate holds vendor credentials server-side.
- Central enforcement: caps, rate limits, device binding, and model restrictions are enforced centrally—so new apps inherit the same security posture.
- Scoped keys per app/environment: recommended best practice is one sub-key per app per environment, with least-privilege configuration.
Developer outcome: fewer places to leak secrets, fewer inconsistent policies, and faster onboarding of new tools into the ecosystem.
9) Threat Model Summary (Plain English)
kAIxU is built to resist common real-world failure modes of AI integrations.
- Stolen token: limited by caps, rate limits, optional install_id binding, max devices, and rapid revoke/rotate.
- Vendor key exposure: prevented because the browser never holds vendor keys; the gate owns them.
- Runaway cost: prevented by spend caps and metering controls.
- Abuse spikes: throttled by RPM (and optionally RPD) enforcement.
- Model compliance requirements: handled by per-key provider/model restrictions.
- RAG leakage risk: reduced by tenant namespace separation (collections), and by storing knowledge separately from raw transcripts, enabling tighter retention policies.
Key idea: flexibility exists at the integration layer, while enforcement exists at the token/policy layer. That’s how you scale to many apps without turning into chaos.
10) Deployment Modes and Client Options
kAIxU can be delivered in multiple operational models depending on client maturity:
- Managed gateway (recommended): Skyes Over London LC hosts, hardens, monitors, and maintains the gate and key system.
- Client-tenant dedicated keys: each client receives master + sub-keys with enforced caps and scopes; usage exports provide billing and audit evidence.
- Client-controlled allowlists: origins and/or environments can be locked down as client deployments stabilize, while maintaining the same key enforcement model.
- RAG enablement: clients can enable document ingestion to power semantic recall, with collection/namespace controls and configurable retention strategy.