Architecture Design Document

Version: 1.0  ·  Date: May 2025  ·  Status: Approved

This document defines the technical architecture for the Construo platform — a configurable, multi-tenant construction project management SaaS.


1. Executive Summary

The platform is designed to be sold to multiple construction companies, each with different data capture requirements, workflows, and identity providers.

Core priorities:

  • Maintainability for a small team starting at 2 engineers
  • Security and compliance from day one (ISO 27001, SOC 2, Cyber Essentials)
  • Offline-first capability for construction sites with poor connectivity
  • Configurability — tenants define their own schemas, modules, and branding
  • A clear growth path from 4 tenants at launch to 100+ at scale

Technology decisions at a glance

Layer | Decision
Backend | FastAPI (Python 3.12)
Frontend | React + TypeScript (Vite)
Database | PostgreSQL on AWS RDS — schema-per-tenant
Auth | AWS Cognito + Entra ID SAML federation
Offline sync | PowerSync (managed sync engine)
Hosting | AWS — ECS Fargate, S3, CloudFront
IaC | Terraform
Regions | eu-west-2 (London) as primary; eu-central-1 (Frankfurt) for EU data residency tenants from Phase 2
V1 scope | Manual data entry, all core construction modules
V2+ scope | AI features — Smart Import, Licence Scanning

2. System Architecture

High-level layers

Layer | Components | Responsibility
Edge | CloudFront + WAF + Route 53 | CDN, DDoS, tenant subdomain routing, TLS termination
Frontend | React SPA on S3 + CloudFront | UI, offline-capable PWA, service worker, local cache
Sync | PowerSync Cloud | Offline-first data sync between client and backend
API | FastAPI on ECS Fargate | Business logic, REST API, tenant context, auth middleware
Async | SQS + Lambda | Background jobs, imports, notifications, scheduled tasks
Data | RDS PostgreSQL (schema-per-tenant) | Persistent structured data with tenant isolation
Cache | ElastiCache (Redis) | Session data, rate limiting, short-lived computations
Files | S3 (per-tenant prefixes) | Documents, images, exports, licence scan uploads
Identity | Cognito + Entra ID | Authentication, federation, token issuance
Secrets | AWS Secrets Manager | DB credentials, API keys, tenant config
Observability | CloudWatch + X-Ray + Sentry | Logs, tracing, alerts, application errors

Request flow

A typical authenticated API request:

  1. User browser or mobile app requests acme.construo.io
  2. Route 53 resolves the wildcard subdomain to CloudFront
  3. CloudFront applies WAF rules, forwards to ALB
  4. ALB routes to ECS Fargate task running FastAPI
  5. FastAPI middleware extracts tenant from subdomain, loads tenant context from Redis (fallback to DB)
  6. JWT validated against Cognito. Tenant membership and RBAC role attached to request context
  7. Business logic executes against the tenant's PostgreSQL schema
  8. Response returned. Async side-effects (audit logs, notifications) dispatched to SQS
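
Step 5 of the flow above (extracting the tenant from the subdomain) can be sketched as a small pure function. This is a minimal illustration, not the platform's actual middleware: the function name `tenant_slug_from_host` and the slug pattern are assumptions, and only the `construo.io` domain comes from this document.

```python
import re
from typing import Optional

BASE_DOMAIN = "construo.io"  # taken from the request-flow example
# Assumed slug rule: a single DNS label, 1-63 chars, no leading/trailing hyphen
_SLUG_RE = re.compile(r"^[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?$")


def tenant_slug_from_host(host: str, base_domain: str = BASE_DOMAIN) -> Optional[str]:
    """Return the tenant slug for a Host header value, or None if it is not a tenant host."""
    # Drop any port, normalise case, and remove a trailing dot
    hostname = host.split(":", 1)[0].strip().lower().rstrip(".")
    suffix = "." + base_domain
    if not hostname.endswith(suffix):
        return None
    slug = hostname[: -len(suffix)]
    # Reject nested subdomains (a.b.construo.io) and malformed labels
    if not _SLUG_RE.match(slug):
        return None
    return slug
```

The slug returned here only selects which tenant context to load. It says nothing about whether the caller belongs to that tenant; that still has to come from the validated JWT in step 6.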

Why FastAPI over Next.js

The original recommendation was Next.js for developer speed. This was revised because the platform commits to Python for V2 AI features (Smart Import, Licence Scanning). Maintaining two backend languages — JavaScript API routes and Python AI services — would create unnecessary complexity. FastAPI is the right single-language choice.

OpenAPI as the contract: FastAPI auto-generates an OpenAPI spec. The frontend uses openapi-typescript to generate TypeScript types from it as a pre-build step. Any breaking API change fails the frontend build immediately — no manual type maintenance.


3. AWS Infrastructure

Service map

AWS Service | Configuration | Purpose
Route 53 | Wildcard *.construo.io ALIAS to CloudFront | Tenant subdomain DNS
CloudFront | WAF attached, S3 + ALB origins, custom SSL | CDN, edge security, TLS
AWS WAF | OWASP rule set, rate limiting, geo-blocking | Application firewall
ACM | Wildcard cert *.construo.io per region | TLS certificates
S3 (frontend) | Static hosting, versioned, OAC | React SPA assets
S3 (files) | Per-tenant prefix, versioning, lifecycle rules | Tenant documents, images
ECS Fargate | 2 tasks min, autoscaling, 2 AZs | FastAPI containers
ECR | Private registry, image scanning enabled | Docker images
ALB | Path-based routing, health checks, access logs | Load balancing
RDS PostgreSQL | Multi-AZ, db.t3.medium → db.r6g.large at scale, encrypted | Primary data store
ElastiCache Redis | Cluster mode, 2 AZs, encrypted in-transit | Sessions, rate limiting, cache
SQS | Standard queues per job type, DLQ on each | Async task queue
Lambda | Python 3.12, VPC-attached for DB access | Background job workers
Cognito | User Pool per env, Entra ID IdP federation | Identity and token issuance
Secrets Manager | Automatic rotation, VPC endpoint | Credentials and secrets
KMS | CMKs for RDS, S3, SQS encryption | Encryption key management
CloudWatch | Log groups per service, metric alarms, dashboards | Monitoring
CloudTrail | All regions, S3 storage, 7-year retention | Audit logging
VPC | 3 AZs, public/private/data subnets, NAT GW | Network isolation

Network architecture

Three subnet tiers:

  • Public — ALB, NAT Gateway only. Nothing else is directly internet-accessible.
  • Private — ECS Fargate tasks, Lambda functions. Outbound via NAT Gateway.
  • Data — RDS, ElastiCache. No outbound internet. Accessible only from private subnets.

Multi-region

Phase 1 deploys to eu-west-2 (London) only. From Phase 2, tenants that require EU data residency are hosted in eu-central-1 (Frankfurt). The tenant registry stores each tenant's home region, and the application routes requests accordingly.

Environments

Environment | AWS Account | Data
development | dev account | Synthetic only
staging | non-prod account | Anonymised copy of prod
production | prod account | Real tenant data

Separate accounts, not separate VPCs

Use separate AWS accounts per environment. This supports the environment-separation controls that SOC 2 and ISO 27001 auditors expect, and it prevents staging credentials from accidentally accessing production resources.


4. Multi-Tenancy and Data Isolation

Strategy: schema-per-tenant

Each tenant gets a dedicated PostgreSQL schema within the shared RDS instance. All tenant tables live under their schema (e.g. acme.projects, acme.personnel). A shared public schema contains the tenant registry and global configuration.

Approach | Notes
Schema-per-tenant (chosen) | Strong isolation, simple GDPR deletion (DROP SCHEMA), per-tenant backup/restore, no row-level filtering bugs, good to ~200 tenants
Row-level isolation (shared tables filtered by tenant) | Highest risk — one missing WHERE clause leaks cross-tenant data
Database-per-tenant | Maximum isolation but operationally expensive at launch scale
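
With schema-per-tenant, the schema name ends up inside SQL as an identifier, and identifiers cannot be bound as query parameters. The sketch below shows one way to validate and quote it before building a `search_path` statement. The helper names (`quote_schema`, `search_path_sql`) and the naming rule are assumptions for illustration; the 63-character limit matches the `VARCHAR(63)` column in the tenant registry.

```python
import re

# Assumed naming rule: lowercase letter first, then letters, digits or underscores, max 63 chars
_SCHEMA_RE = re.compile(r"^[a-z][a-z0-9_]{0,62}$")
_RESERVED = {"public", "information_schema"}


def quote_schema(schema_name: str) -> str:
    """Validate a tenant schema name and return it as a quoted SQL identifier."""
    if (
        not _SCHEMA_RE.match(schema_name)
        or schema_name in _RESERVED
        or schema_name.startswith("pg_")  # PostgreSQL reserves the pg_ prefix
    ):
        raise ValueError(f"invalid tenant schema name: {schema_name!r}")
    return '"' + schema_name + '"'


def search_path_sql(schema_name: str) -> str:
    """Build the statement that scopes the current transaction to one tenant schema."""
    # SET LOCAL lasts only until the end of the current transaction
    return f"SET LOCAL search_path TO {quote_schema(schema_name)}, public"
```

Because `SET LOCAL` is scoped to the transaction, the setting does not leak to the next request that reuses the same pooled connection, which matters if a transaction-level pooler sits in front of RDS.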

Tenant registry

The public.tenants table is the first lookup on every request:

Column | Type | Description
id | UUID | Tenant identifier
slug | VARCHAR(63) UNIQUE | Subdomain slug — acme for acme.construo.io
schema_name | VARCHAR(63) | PostgreSQL schema name
region | VARCHAR(20) | AWS region for data residency
plan | ENUM | starter / professional / enterprise
status | ENUM | active / suspended / trial / offboarded
idp_type | ENUM NULL | none / entra_id / okta / google
idp_config | JSONB NULL | SAML/OIDC metadata
retention_days | INTEGER | Configurable data retention period
modules_enabled | TEXT[] | List of enabled module identifiers

Tenant provisioning

When a new tenant is onboarded:

  1. Insert row into public.tenants
  2. CREATE SCHEMA {schema_name}
  3. Run Alembic migrations targeting the new schema
  4. Seed default config (field definitions, module settings, roles)
  5. Provision Cognito User Pool App Client
  6. Configure Entra ID federation if applicable
  7. Create S3 bucket prefix: s3://construo-files-prod/{tenant_id}/
  8. Set up CloudWatch log group and metric filters
  9. Provision tenant subdomain DNS record via Route 53
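
The nine steps above touch several independent systems, so a failure partway through can leave a half-provisioned tenant. One possible way to structure the workflow is as an ordered list of steps, each paired with an undo action that runs in reverse order on failure. This is a sketch only: the `Step` type and `provision` function are hypothetical, and the real steps would call PostgreSQL, Cognito, S3 and so on.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    name: str
    apply: Callable[[], None]  # performs the provisioning action
    undo: Callable[[], None]   # compensating action if a later step fails


def provision(steps: List[Step]) -> List[str]:
    """Run steps in order; on failure, undo completed steps in reverse and re-raise."""
    done: List[Step] = []
    try:
        for step in steps:
            step.apply()
            done.append(step)
    except Exception:
        for step in reversed(done):
            step.undo()
        raise
    return [step.name for step in done]
```

A step that fails is assumed to have had no effect, so only previously completed steps are undone. Steps that cannot fail cleanly would need their own idempotent cleanup.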

5. Authentication and Access Control

Architecture

AWS Cognito is the central identity broker. All authentication methods produce a consistent JWT regardless of the sign-in method.

Auth method | Use case
Username + password | Tenants without a corporate IDP
Entra ID (SAML 2.0) | Primary enterprise federation — Microsoft 365 orgs
OIDC federation | Other IDPs (Okta, Google Workspace, Ping)
API keys | ERP integrations, machine-to-machine

RBAC roles

Role | Scope | Key permissions
Platform Admin | Global | Tenant management, platform config — Construo staff only
Tenant Admin | Tenant | User management, module config, field schema builder, IDP config
Project Manager | Tenant | Full CRUD on assigned projects and sites
Site Foreman | Project | Create/edit site diary, personnel, plant on assigned sites
Site Operative | Site | View site data, log own attendance, sign inductions
Viewer | Project or Tenant | Read-only
Integration | Tenant | API key role — scoped access for ERP sync

Permission enforcement

Three layers — defence in depth, required for SOC 2 and ISO 27001:

  1. FastAPI dependency injection — every route declares required permissions via Depends(require_permission('sites:write'))
  2. Database row-level checks — schema_name is always part of query context. Parameterised queries only.
  3. Frontend route guards — React Router guards for UX only; API is the authoritative enforcement point.
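
The first layer can be illustrated without the framework. The sketch below shows the shape of a `require_permission` factory: it returns a checker that raises unless one of the caller's roles grants the permission. The role keys and the role-to-permission mapping are hypothetical, since this document lists roles but not their permission strings. In FastAPI the returned callable would be wrapped in `Depends` and would read the roles from the request context rather than take them as an argument.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Set

# Hypothetical mapping, for illustration only
ROLE_PERMISSIONS: Dict[str, FrozenSet[str]] = {
    "tenant_admin": frozenset({"sites:read", "sites:write", "users:manage"}),
    "site_foreman": frozenset({"sites:read", "sites:write"}),
    "viewer": frozenset({"sites:read"}),
}


class PermissionDenied(Exception):
    """Raised when none of the caller's roles grants the required permission."""


def require_permission(permission: str) -> Callable[[Iterable[str]], None]:
    """Return a checker for one permission, e.g. require_permission('sites:write')."""

    def check(roles: Iterable[str]) -> None:
        granted: Set[str] = set()
        for role in roles:
            # Unknown roles grant nothing rather than raising
            granted |= ROLE_PERMISSIONS.get(role, frozenset())
        if permission not in granted:
            raise PermissionDenied(permission)

    return check
```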

Never trust client-supplied tenant IDs

Tenant context must always be derived from the verified JWT on the server side. Trusting a tenant ID supplied by the client is one of the most common multi-tenancy vulnerabilities.

JWT custom claims

Cognito-issued JWTs contain these custom claims added via a Lambda trigger:

  • tenant_id — UUID of the tenant
  • tenant_slug — subdomain slug for routing
  • platform_roles — list of RBAC roles
  • project_access — list of project UUIDs (empty = all within tenant)
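
A sketch of how these claims might be turned into a typed request context once the token signature has been verified. It assumes the payload has already been decoded into a dict with list-valued claims; how Cognito serialises list claims depends on the trigger configuration, so treat the parsing as illustrative. The `TenantClaims` type and `claims_from_verified_token` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, FrozenSet, Mapping, Tuple
from uuid import UUID


@dataclass(frozen=True)
class TenantClaims:
    tenant_id: UUID
    tenant_slug: str
    platform_roles: Tuple[str, ...]
    project_access: FrozenSet[UUID]

    def can_access_project(self, project_id: UUID) -> bool:
        # An empty project_access list means "all projects within the tenant"
        return not self.project_access or project_id in self.project_access


def claims_from_verified_token(payload: Mapping[str, Any], host_slug: str) -> TenantClaims:
    """Build the tenant context from an already-verified JWT payload."""
    claims = TenantClaims(
        tenant_id=UUID(payload["tenant_id"]),
        tenant_slug=payload["tenant_slug"],
        platform_roles=tuple(payload.get("platform_roles", ())),
        project_access=frozenset(UUID(p) for p in payload.get("project_access", ())),
    )
    # The subdomain is client-controlled, so it must agree with the token
    if claims.tenant_slug != host_slug:
        raise PermissionError("token tenant does not match request subdomain")
    return claims
```

Comparing the token's `tenant_slug` with the request subdomain is one way to make sure a valid token for one tenant cannot be replayed against another tenant's hostname.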

6. Offline Sync Architecture

Why this is non-trivial

Construction sites frequently have no mobile data coverage. Users must be able to record site diary entries, log personnel, capture plant movements, and raise incidents without internet. When connectivity returns, changes must sync reliably without data loss or corruption.

Building a reliable sync engine from scratch is a multi-month engineering effort with subtle failure modes. PowerSync is a managed service that solves this problem.

PowerSync integration

Component | Technology | Role
Backend connector | PowerSync Python SDK + FastAPI webhook | Publishes data changes to PowerSync Cloud
Sync rules | PowerSync YAML schema | Defines which tables/rows sync to which users
Client SDK | PowerSync React SDK (web) / React Native SDK (mobile) | Local SQLite, sync engine
Conflict resolution | Last-write-wins with server authority | Server is always authoritative

Sync scope

Not all data syncs to all clients:

  • Site Foreman: assigned sites, site diary entries (last 90 days), personnel list, plant register, pending forms
  • Project Manager: all sites within assigned projects, aggregated view data
  • Large binaries (documents, images) — fetched on-demand from S3 via signed URLs, cached by service worker

Conflict resolution

V1 uses last-write-wins with server authority. Each record has updated_at. When a client syncs, conflicts resolve in favour of the most recent server timestamp. Client changes are queued in a local SQLite upload queue and replayed in order when connectivity returns.
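
The paragraph above leaves some room for interpretation. The sketch below takes one possible reading: queued client changes are replayed in order, the server stamps `updated_at` with its own clock, and any timestamp sent by the client is ignored, so the last write to reach the server wins. The function name, the change format and the set of server-owned columns are assumptions, and this is not PowerSync's API.

```python
from datetime import datetime
from typing import Any, Callable, Dict, Iterable, Mapping

Row = Dict[str, Any]
# Columns the client is never allowed to set (assumed list)
SERVER_OWNED = {"id", "created_at", "updated_at"}


def replay_upload_queue(
    server_rows: Dict[str, Row],
    queue: Iterable[Mapping[str, Any]],
    clock: Callable[[], datetime],
) -> Dict[str, Row]:
    """Apply queued client changes in order; the server clock decides updated_at."""
    for change in queue:
        row = dict(server_rows.get(change["id"], {"id": change["id"]}))
        for key, value in change["fields"].items():
            if key not in SERVER_OWNED:
                row[key] = value
        # Server authority: timestamps come from the server, never from the device
        row["updated_at"] = clock()
        server_rows[change["id"]] = row
    return server_rows
```

Ignoring device timestamps avoids a class of bugs where a phone with a wrong clock silently overwrites newer data. The trade-off is that a change made offline hours ago still overwrites a more recent online edit when it finally arrives.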


7. Core Data Model

Design principles

  • Core fields — fixed columns for universally required attributes (IDs, timestamps, relationships)
  • Custom fields — a JSONB custom_fields column on each entity stores tenant-defined attributes
  • Audit trail — every table has a corresponding _audit table capturing before/after state, user, and timestamp

Universal columns (all tables)

Column | Type
id | UUID DEFAULT gen_random_uuid()
created_at | TIMESTAMPTZ DEFAULT NOW()
updated_at | TIMESTAMPTZ DEFAULT NOW()
created_by | UUID FK → users.id
updated_by | UUID FK → users.id
deleted_at | TIMESTAMPTZ NULL (soft delete)
custom_fields | JSONB DEFAULT '{}'

Core entities

Key entities: projects, sites, site_diary_entries, personnel, site_attendance, plant_equipment, incidents, documents, deliveries, subcontractors, inductions, field_definitions.

Full column-level schemas are documented in Data Model and will be expanded as each module is built.


8. Configurability Engine

Three levels

  1. Field Schema — tenants add custom fields to any entity
  2. Module Configuration — tenants enable/disable platform modules
  3. White-Label — branding, subdomain, theme colours

Field definitions table

Column | Type | Description
entity_type | VARCHAR(50) | e.g. site_diary_entry, personnel
field_key | VARCHAR(100) | Snake_case key in custom_fields JSONB
label | VARCHAR(255) | Display label shown in UI
field_type | ENUM | text / number / date / boolean / select / multi_select / file / user_ref
is_required | BOOLEAN | Validation: field must be present
options | JSONB NULL | For select types: array of {value, label}
display_order | INTEGER | Sort order in forms

Field definitions are Redis-cached (5-minute TTL), invalidated when a tenant admin saves changes.
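
To make the relationship between field definitions and the `custom_fields` JSONB column concrete, here is a minimal validation sketch. It takes definitions shaped like the table above and returns a list of error strings. The function names and error wording are assumptions; dates are assumed to arrive as ISO 8601 strings, and `file` and `user_ref` values are not checked because that would need lookups against other tables.

```python
from datetime import date
from typing import Any, List, Mapping, Sequence


def _value_ok(defn: Mapping[str, Any], value: Any) -> bool:
    """Check one value against its field definition's field_type."""
    ftype = defn["field_type"]
    allowed = {opt["value"] for opt in (defn.get("options") or [])}
    if ftype == "text":
        return isinstance(value, str)
    if ftype == "number":
        # bool is a subclass of int in Python, so exclude it explicitly
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if ftype == "boolean":
        return isinstance(value, bool)
    if ftype == "date":
        if not isinstance(value, str):
            return False
        try:
            date.fromisoformat(value)
        except ValueError:
            return False
        return True
    if ftype == "select":
        return isinstance(value, str) and value in allowed
    if ftype == "multi_select":
        return isinstance(value, list) and all(
            isinstance(v, str) and v in allowed for v in value
        )
    return True  # file / user_ref: not validated in this sketch


def validate_custom_fields(
    definitions: Sequence[Mapping[str, Any]], custom_fields: Mapping[str, Any]
) -> List[str]:
    """Return validation errors; an empty list means the payload is acceptable."""
    errors: List[str] = []
    known = {d["field_key"]: d for d in definitions}
    for key in custom_fields:
        if key not in known:
            errors.append(f"{key}: unknown field")
    for key, defn in known.items():
        value = custom_fields.get(key)
        if value is None:
            if defn.get("is_required"):
                errors.append(f"{key}: required")
        elif not _value_ok(defn, value):
            errors.append(f"{key}: expected {defn['field_type']}")
    return errors
```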

V1 Modules

Module key | Description | Release
site_diary | Daily site diary entries | V1
personnel | Worker register and attendance | V1
plant | Plant and equipment register | V1
documents | Document register with version control | V1
incidents | Incident and near-miss reporting | V1
inductions | Site induction tracking and sign-off | V1
deliveries | Materials delivery log | V1
subcontractors | Subcontractor company management | V1
timesheets | Daily hours per worker per site | V2
rams | Risk assessments and method statements | V2
snag_lists | Inspections and punch lists | V2
reporting | Scheduled reports and dashboards | V2
licence_scan | AI-powered licence OCR scanning | V2
smart_import | AI-assisted spreadsheet import | V2

9. Integration Architecture

Generic ERP integration layer

Rather than building bespoke connectors for each ERP system, the integration model uses a Transform Pipeline approach:

  1. Platform API — versioned REST API, all data accessible via authenticated API key
  2. Transform service — a small Python Lambda per ERP integration containing the field mapping logic
  3. ERP API / file — REST (Procore, Autodesk), file export (Sage, Viewpoint), webhook (Oracle)
  4. Scheduler — EventBridge triggers the transform Lambda on a defined cadence

Platform API principles

  • Versioned from day one: /api/v1/. Never break existing consumers.
  • Cursor-based pagination on all list endpoints
  • Webhook support: tenants register URLs for entity events
  • Rate limiting: per API key, enforced at ALB via WAF (1,000 req/min default)
  • Idempotency keys on write operations
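
Cursor-based pagination from the list above can be sketched as follows. The cursor is an opaque, URL-safe encoding of the last row's sort key, here assumed to be `(created_at, id)`. The in-memory `paginate` function stands in for a keyset query against PostgreSQL, and all names are illustrative. It also assumes `created_at` values are ISO 8601 strings in a single consistent format so that they sort correctly as text.

```python
import base64
import json
from typing import Any, Dict, List, Optional, Tuple

Row = Dict[str, Any]


def encode_cursor(created_at: str, row_id: str) -> str:
    """Pack the sort key of the last row on a page into an opaque token."""
    raw = json.dumps([created_at, row_id]).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_cursor(cursor: str) -> Tuple[str, str]:
    created_at, row_id = json.loads(base64.urlsafe_b64decode(cursor.encode("ascii")))
    return created_at, row_id


def paginate(
    rows: List[Row], limit: int, cursor: Optional[str] = None
) -> Tuple[List[Row], Optional[str]]:
    """Return one page plus the cursor for the next page (None when exhausted)."""
    # Sorting by (created_at, id) gives a stable total order even for equal timestamps
    ordered = sorted(rows, key=lambda r: (r["created_at"], r["id"]))
    if cursor:
        after = decode_cursor(cursor)
        ordered = [r for r in ordered if (r["created_at"], r["id"]) > after]
    page = ordered[:limit]
    has_more = len(ordered) > limit
    next_cursor = (
        encode_cursor(page[-1]["created_at"], page[-1]["id"]) if has_more else None
    )
    return page, next_cursor
```

Unlike offset pagination, a cursor keyed on the sort order stays stable when rows are inserted between requests, which matters for integrations that poll list endpoints on a schedule.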

V2 — Smart Import

  1. Upload CSV/XLSX → S3 → SQS queue
  2. Lambda parses headers and sample rows
  3. Claude API interprets columns, proposes field mapping JSON
  4. User confirms/adjusts the mapping
  5. FastAPI validates and bulk-inserts with error report on partial failures
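
Step 2 of this pipeline (parsing headers and sample rows) is simple enough to sketch with the standard library. The example handles CSV only, since XLSX would need a third-party parser. The function name `preview_csv` and the output shape are assumptions; the point is that only headers and a handful of rows need to go to the model, not the whole file.

```python
import csv
import io
from typing import Any, Dict, List


def preview_csv(text: str, sample_size: int = 5) -> Dict[str, Any]:
    """Extract trimmed headers and up to sample_size non-empty rows from CSV text."""
    reader = csv.reader(io.StringIO(text))
    headers = [h.strip() for h in next(reader, [])]
    samples: List[Dict[str, str]] = []
    for row in reader:
        if len(samples) >= sample_size:
            break
        # Skip rows that are empty or contain only blank cells
        if not any(cell.strip() for cell in row):
            continue
        samples.append(dict(zip(headers, row)))
    return {"headers": headers, "sample_rows": samples}
```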

V2 — Licence Scanning

  1. Mobile camera → upload to S3 via presigned URL
  2. Lambda invokes AWS Textract
  3. Claude API interprets raw Textract output → name, licence number, expiry, categories
  4. User confirms before saving to personnel.licences JSONB

10. Security and Compliance

Target frameworks

Framework | Target date | Status
Cyber Essentials | Phase 3 (Week 38) | Planned
Cyber Essentials Plus | Month 10 | Planned
SOC 2 Type I | Month 12 | Planned
ISO 27001 | Month 18 | Planned
SOC 2 Type II | Month 24 | Planned

Key controls

Control | Implementation
Encryption in transit | TLS 1.2+ everywhere; HTTP → HTTPS redirect
Encryption at rest | RDS, S3, EBS encrypted via KMS CMK
Network segmentation | 3-tier VPC; SGs allow minimum required traffic
WAF | CloudFront WAF with AWS Managed Rules (OWASP Core, Known Bad Inputs)
Secrets management | All secrets in AWS Secrets Manager; automatic rotation
Patch management | ECR image scanning; Dependabot; Fargate managed patches
Audit logging | Immutable audit trail on all entity writes; CloudTrail; 7-year retention
Backup | RDS automated backups (7-day); S3 versioning; PITR tested quarterly
MFA | Enforced for all admin roles via Cognito
Pen testing | Annual external test; findings tracked to closure

GDPR

  • UK tenants: eu-west-2. EU tenants: eu-central-1. No transfer outside UK/EU.
  • Tenant data export: JSON export of all data for a named person (right of access)
  • Tenant data deletion: right to erasure — triggers anonymisation workflow
  • Data retention: configurable per tenant; nightly Lambda applies retention policies
  • DPA: Data Processing Agreement template, signed before any tenant goes live
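
The nightly retention job can be reduced to a cutoff calculation per tenant. The sketch below assumes, purely for illustration, that retention is measured from each row's `created_at` and that rows exactly at the cutoff are kept; the document does not specify either point. The function names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Any, Iterable, List, Mapping


def retention_cutoff(now: datetime, retention_days: int) -> datetime:
    """Return the instant before which rows fall outside the retention period."""
    if retention_days <= 0:
        raise ValueError("retention_days must be positive")
    return now - timedelta(days=retention_days)


def ids_past_retention(
    rows: Iterable[Mapping[str, Any]], now: datetime, retention_days: int
) -> List[Any]:
    """List the ids of rows older than the tenant's retention period."""
    cutoff = retention_cutoff(now, retention_days)
    # Assumption: age is measured from created_at; rows at the cutoff are kept
    return [row["id"] for row in rows if row["created_at"] < cutoff]
```

Passing `now` in explicitly keeps the calculation deterministic, which makes the policy easy to cover with tests before it is allowed to touch real data.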

11. Phased Delivery Plan (Summary)

Phase | Weeks | Goal
Phase 0 — Foundation | 1–6 | CI/CD, AWS infrastructure, auth end-to-end, tenant provisioning
Phase 1 — Core Modules | 7–22 | All 8 V1 modules, offline sync, tested by product owner
Phase 2 — Production Ready | 23–32 | White-label, ERP integration, security hardening, pilot tenants
Phase 3 — Launch | 33–40 | Pilot feedback, mobile, Cyber Essentials, commercial launch

For the full sprint-by-sprint breakdown, see Project Plan.


12. Monorepo Structure

platform/
├── apps/
│   ├── api/                  # FastAPI backend
│   │   ├── src/
│   │   │   ├── core/         # Auth, tenant context, db, audit
│   │   │   ├── modules/      # Sites, personnel, plant, etc.
│   │   │   └── main.py
│   │   ├── tests/
│   │   ├── alembic/
│   │   └── Dockerfile
│   └── web/                  # React frontend
│       ├── src/
│       │   ├── core/
│       │   └── modules/
│       └── vite.config.ts
├── packages/
│   ├── types/                # Auto-generated TypeScript from OpenAPI spec
│   ├── db/                   # SQLAlchemy models, Alembic migrations
│   └── shared/               # Constants, enums
├── infra/
│   └── terraform/            # All AWS infrastructure
│       ├── modules/
│       └── environments/     # dev, staging, prod
├── .github/
│   └── workflows/            # CI/CD pipelines
└── docs/                     # This documentation site

13. Key Risks

Risk | Likelihood | Impact | Mitigation
Junior engineer accepts AI-generated code without understanding it | High | Critical | Teach-back rule; weekly code walk-through; AI development rules
Offline sync conflicts cause data corruption | Medium | High | PowerSync handles at engine level; soft-delete only
Schema-per-tenant hits RDS connection limits | High | Medium | PgBouncer from day one; RDS Proxy at scale
Tenant data leakage via query bug | Low | Critical | Schema isolation; cross-tenant tests in CI; parameterised queries only
Entra ID federation misconfiguration | Low | High | Test in staging before prod; second review of attribute mapping
AWS costs spike due to misconfiguration | Medium | Medium | Budget alerts at 80%/100%; junior engineer cannot provision without approval
Compliance audit reveals need for architectural rework | Low | High | Audit logging and isolation built into Phase 0