3.1 · Platform Architect · Architecture
Book 1 · Ch 9 · Architecture for Survivability

Platform Spec v1: Level 3 Self-Serve Architecture

L3 architecture template covering the multi-tenant data plane, control plane, identity, and observability, with a NIST 800-53 r5 control mapping. Specific component choices will be filled in once L2 is stable and three paying customers are on the first SKU.

3.1 · Platform Architect · artifact id: platform-spec-v0.html · v0 · 2026-05-28
Format stub. This spec establishes the L3 architecture topology, control mapping approach, and decision gates. Specific component selections (GovCloud services, identity provider, SIEM), control inheritance percentages, and boundary definitions will be filled in once the L2 managed platform is stable and customer environments are confirmed.

Why Ch 9 is the load-bearing chapter for this spec

Ch 9 of Shrink-Wrap It states: "Architecture decisions made in month one determine authorization feasibility in year two, and operational sustainability in year five." The L3 platform spec is therefore not only a Phase 3 concern: the architectural decisions that enable L3 self-serve must be designed into L2, even if the self-serve interface ships later. Retrofitting a multi-tenant control plane onto a single-tenant architecture costs about as much as a rewrite. This spec documents the L3 target so that every L2 build decision can be tested against it.

Architecture levels defined

| Level | Description | NorthAI target phase | Customer provisioning |
|---|---|---|---|
| L1 | Productized service: fixed scope, fixed deliverables, human-driven delivery | Current state (pre-engagement) | Manual, 4-6 week lead time |
| L2 | Managed platform: software-delivered, repeatable, but provisioning requires NorthAI staff involvement | Phase 1-2 (Months 1-12) | Semi-manual; NorthAI provisions each tenant; 1-2 week lead time |
| L3 | Self-serve platform: customer can onboard, configure, and expand without NorthAI staff involvement | Phase 3 (Months 10+, conditional) | Self-serve; 24-hour or same-day provisioning |

L3 architecture topology

Data plane (multi-tenant)

The data plane handles the runtime operations: authentication decisions, posture data ingestion, and dashboard rendering. For L3 to work with FedRAMP Moderate authorization, the data plane must be multi-tenant but with strict tenant isolation at the data layer.

| Component | Function | Isolation model | Component choice (TBD) |
|---|---|---|---|
| Tenant data store | Persists per-tenant posture data, audit logs, configuration | Row-level security or schema-per-tenant; shared infrastructure, isolated data | [TBD: AWS RDS row-level security / Aurora Serverless / DynamoDB with tenant partition key] |
| Auth decision engine | Evaluates authentication requests; enforces policy per tenant | Tenant-scoped policy evaluation; no cross-tenant policy bleed | [TBD: Open Policy Agent / custom policy engine] |
| Posture data ingestion | Receives agency system health signals; normalizes for dashboard | Per-tenant ingestion queues; no shared queue across tenants | [TBD: SQS per-tenant queues / EventBridge with tenant routing] |
| Dashboard rendering | Serves the federal-posture dashboard to agency end users | Session isolation; tenant context validated at request time | [TBD: Next.js SSR / static generation with per-tenant config injection] |
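
The isolation model described above (shared infrastructure, isolated data, tenant context validated at request time) can be illustrated with a minimal Python sketch. It is not a component choice: the in-memory store only stands in for whichever TBD data store is selected, and `RequestContext`, `TenantDataStore`, and `TenantIsolationError` are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass, field


class TenantIsolationError(Exception):
    """Raised when a request asks for another tenant's rows."""


@dataclass(frozen=True)
class RequestContext:
    # Assumption: tenant_id is taken from an already-validated session or token.
    tenant_id: str
    user_id: str


@dataclass
class TenantDataStore:
    # Rows are keyed by tenant first, loosely mirroring a tenant partition key.
    _rows: dict[str, list[dict]] = field(default_factory=dict)

    def put(self, ctx: RequestContext, record: dict) -> None:
        # The tenant is stamped from the context, never trusted from the payload.
        stamped = {**record, "tenant_id": ctx.tenant_id}
        self._rows.setdefault(ctx.tenant_id, []).append(stamped)

    def query(self, ctx: RequestContext, tenant_id: str) -> list[dict]:
        # Tenant context is checked on every request, not once at login.
        if tenant_id != ctx.tenant_id:
            raise TenantIsolationError(
                f"{ctx.user_id} may not read tenant {tenant_id!r}"
            )
        return list(self._rows.get(tenant_id, []))
```

In this sketch a caller from one agency can neither read another agency's rows nor write rows under another tenant's identifier, which is the property a row-level-security or schema-per-tenant design would need to enforce at the database layer.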

Control plane

The control plane handles tenant lifecycle operations: provisioning, configuration changes, user management, and billing events. This is the layer that enables L3 self-serve. In L2, the control plane is operated by NorthAI staff. In L3, it is exposed via API and UI to agency admins.

| Component | Function | L2 implementation | L3 target |
|---|---|---|---|
| Tenant provisioning | Creates a new tenant environment: data store, identity config, ingestion pipeline | Manual runbook executed by NorthAI ops; 1-2 week SLA | API-driven; agency admin triggers provisioning via self-serve portal; 24-hour SLA |
| Configuration management | Allows tenant-specific configuration of Zone 2 (configurable surface) | NorthAI CS configures on behalf of agency | Agency admin configures via self-serve UI; NorthAI retains override capability for security controls |
| User lifecycle | Add, remove, and modify user roles and access levels within the tenant | NorthAI provisions via ticketed request; agency submits user list | Agency admin self-serves; SCIM provisioning from agency IdP where available |
| Billing events | Tracks user count and usage for subscription invoicing | Manual export monthly; NorthAI invoices based on roster | Automated; usage metering triggers invoice generation; agency can view usage via dashboard |
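
One way to read the L2-to-L3 transition is that the provisioning workflow stays the same and only the set of actors allowed to trigger it changes. The Python sketch below illustrates that idea under stated assumptions: the step names are taken from the tenant provisioning row above, while `Actor`, `Tenant`, `ControlPlane`, and the `self_serve_enabled` flag are hypothetical and do not describe an existing NorthAI API.

```python
from dataclasses import dataclass, field
from enum import Enum


class Actor(Enum):
    NORTHAI_OPS = "northai_ops"
    AGENCY_ADMIN = "agency_admin"


# Steps named in the tenant provisioning row; ordering here is an assumption.
PROVISIONING_STEPS = ("data_store", "identity_config", "ingestion_pipeline")


@dataclass
class Tenant:
    tenant_id: str
    completed: list[str] = field(default_factory=list)

    @property
    def ready(self) -> bool:
        return tuple(self.completed) == PROVISIONING_STEPS


class ControlPlane:
    def __init__(self, self_serve_enabled: bool) -> None:
        # False models L2 (staff-operated); True models L3 (self-serve).
        self.self_serve_enabled = self_serve_enabled
        self.tenants: dict[str, Tenant] = {}

    def provision(self, tenant_id: str, actor: Actor) -> Tenant:
        if actor is Actor.AGENCY_ADMIN and not self.self_serve_enabled:
            raise PermissionError("self-serve provisioning is not enabled")
        tenant = self.tenants.setdefault(tenant_id, Tenant(tenant_id))
        for step in PROVISIONING_STEPS:
            # Skipping finished steps keeps a retried request idempotent.
            if step not in tenant.completed:
                tenant.completed.append(step)
        return tenant
```

Designing L2 around the same `provision` entry point, even while only NorthAI staff call it, is what would let the self-serve interface ship later without a rewrite.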

Identity layer

| Requirement | Standard | Implementation approach (TBD) |
|---|---|---|
| PIV / CAC authentication | HSPD-12, FICAM | [TBD: integrate with agency ICAM via SAML 2.0 or OIDC; PIV-enabled IdP as upstream] |
| MFA enforcement | OMB M-22-09 (phishing-resistant MFA required for federal workers) | [TBD: FIDO2 / WebAuthn as primary; PIV as fallback] |
| Federation across agencies | FICAM Trust Framework | [TBD: connect.gov integration or per-agency SAML federation] |
| Session management | NIST SP 800-63B | [TBD: session timeout per agency policy; re-auth triggers on sensitive operations] |
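
The session management row (timeout per agency policy, re-authentication on sensitive operations) can be sketched as a small policy check. This is a minimal illustration only: `SessionPolicy`, `Session`, and `requires_reauth` are hypothetical names, and the timeout values used with it are placeholders, not figures taken from NIST SP 800-63B or from any agency policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionPolicy:
    # Per-agency settings; the actual values are TBD and set by agency policy.
    idle_timeout_s: int
    reauth_window_s: int
    sensitive_ops: frozenset[str]


@dataclass
class Session:
    last_activity: float  # epoch seconds of the last request
    last_auth: float      # epoch seconds of the last successful authentication


def requires_reauth(
    session: Session, policy: SessionPolicy, operation: str, now: float
) -> bool:
    """Return True if the caller must authenticate again before proceeding."""
    # An idle session expires regardless of the operation requested.
    if now - session.last_activity > policy.idle_timeout_s:
        return True
    # Sensitive operations additionally require a recent authentication.
    if operation in policy.sensitive_ops:
        return now - session.last_auth > policy.reauth_window_s
    return False
```

Passing `now` explicitly, rather than reading the clock inside the function, keeps the check deterministic and easy to test against each agency's policy.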

Observability

| Signal type | Purpose | ConMon use | Implementation (TBD) |
|---|---|---|---|
| Infrastructure logs | Capture all system-level events for audit trail | Required for POA&M evidence; SIEM input for anomaly detection | [TBD: AWS CloudTrail + CloudWatch Logs / third-party SIEM on GovCloud] |
| Application traces | Distributed tracing across control and data plane for latency and error root cause | Performance baseline for SLA reporting; issue diagnosis | [TBD: OpenTelemetry with GovCloud-compatible backend] |
| Security events | Authentication failures, privilege escalations, anomalous API calls | Required for ConMon; feeds into incident response playbook | [TBD: SIEM with FedRAMP Moderate authorization; alerts to NorthAI SOC and agency ISSO] |
| Business metrics | Per-tenant active users, feature utilization, API call volume | Not ConMon; feeds into Customer Success cadence (see nps-outcome-cadence-v0) | [TBD: product analytics with tenant isolation; no PII in metrics layer] |
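
The business metrics row asks for tenant isolation with no PII in the metrics layer. A minimal Python sketch of that constraint follows, assuming raw request events arrive as dictionaries; the field names in `PII_FIELDS` and the helper names are hypothetical, and a real deny-list (or, more safely, an allow-list) would come from the data classification exercise.

```python
from collections import Counter

# Assumption: fields treated as PII; the real list is TBD.
PII_FIELDS = frozenset({"email", "name", "user_id", "ip_address"})


def to_metric_event(raw: dict) -> dict:
    """Drop PII fields before an event enters the metrics layer."""
    return {k: v for k, v in raw.items() if k not in PII_FIELDS}


def api_calls_per_tenant(raw_events: list[dict]) -> Counter:
    """Aggregate API call volume per tenant from scrubbed events only."""
    counts: Counter = Counter()
    for raw in raw_events:
        event = to_metric_event(raw)
        counts[event["tenant_id"]] += 1
    return counts
```

Scrubbing at the boundary means downstream dashboards and exports never hold PII, which keeps the metrics store out of scope for the controls that govern it.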

NIST 800-53 r5 control mapping (illustrative)

The architecture above targets FedRAMP Moderate authorization. Under Moderate, approximately 325 controls apply. The control inheritance model (Ch 9: "Typical breakdown for Moderate on GovCloud: Inherited 85 controls (26%) / Shared 65 (20%) / Customer 175 (54%)") drives the architectural choices above. The table below maps key architecture decisions to their primary control families.

| Architecture decision | Primary NIST 800-53 r5 family | Key controls | Inheritance potential |
|---|---|---|---|
| Multi-tenant data store with row-level security | Access Control (AC) | AC-3, AC-4, AC-16 | Partial from GovCloud IaaS (infrastructure layer); application layer is customer-responsible |
| PIV / CAC + FIDO2 identity layer | Identification and Authentication (IA) | IA-2, IA-5, IA-8 | Low; identity configuration is customer-responsible per FICAM requirements |
| Infrastructure + security event logging to SIEM | Audit and Accountability (AU) | AU-2, AU-3, AU-6, AU-12 | Moderate; GovCloud CloudTrail provides infrastructure events; application events are customer-responsible |
| GovCloud deployment region | System and Communications Protection (SC) | SC-5, SC-7, SC-8, SC-28 | High; GovCloud inherited controls cover physical protection, network boundaries, and encryption at rest/transit |
| Configuration management (control plane) | Configuration Management (CM) | CM-2, CM-3, CM-6, CM-8 | Low; configuration of NorthAI-specific components is customer-responsible |
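
The Ch 9 inheritance breakdown quoted earlier in this section can be checked with a few lines of arithmetic. The Python sketch below uses only the three counts from that quote (85 inherited, 65 shared, 175 customer); the function name and the derived "NorthAI touches" figure are illustrative additions, not numbers from Ch 9.

```python
# Control counts from the Ch 9 quote for Moderate on GovCloud.
BREAKDOWN = {"inherited": 85, "shared": 65, "customer": 175}


def inheritance_shares(breakdown: dict[str, int]) -> dict[str, int]:
    """Return each category's share of the total, rounded to whole percent."""
    total = sum(breakdown.values())
    return {name: round(100 * count / total) for name, count in breakdown.items()}


total_controls = sum(BREAKDOWN.values())                       # 325
shares = inheritance_shares(BREAKDOWN)                         # 26 / 20 / 54
# Controls NorthAI must implement fully or jointly: shared + customer.
northai_touches = BREAKDOWN["shared"] + BREAKDOWN["customer"]  # 240
```

The counts sum to the roughly 325 controls cited for Moderate, and the rounded shares match the quoted 26% / 20% / 54%. Under these figures about 240 controls (roughly 74%) still require NorthAI work in whole or in part, which is why most rows in the mapping table are marked customer-responsible.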