Security & Compliance

This page describes the security posture of the Scaleout Helm chart (charts/scaleout) and its reference GKE deployment: the controls it implements and enforces, their secure defaults, how to verify them, and the residual risks.

The chart is secure by default: workload hardening, secret management, resource governance and input validation are enabled out of the box, and the chart runs under the Kubernetes Pod Security Standards restricted profile. Network segmentation and edge TLS are opt-in and are each enabled through chart values.

Architecture & trust boundaries

(TLS, cert-manager)         ┌──────── Kubernetes namespace ────────┐
Edge clients ─────────────► │  Ingress (nginx)                      │
Browsers / CLI / FL clients │   /api  /  /kratos /hydra (public)    │
                            │     │                                 │
Edge FL clients ── gRPC ──► │  combiner ◄─► controller ◄─► hooks    │
(JWT-authenticated)         │     api-server                        │
                            │     │  (NetworkPolicy: default-deny)   │
                            │     ▼                                 │
                            │  postgres / mongo / minio (data tier) │
                            └───────────────────────────────────────┘
  • Cluster edge: all external HTTP traffic enters via the Ingress, over TLS once edge TLS is enabled (it is off by default). The Kratos/Hydra public APIs are exposed; their admin APIs are reachable only in-cluster.

  • Authentication: when enabled, browser/API auth is enforced by Ory Kratos (identities/sessions) and Hydra (OAuth2/OIDC); gRPC endpoints enforce JWT.

  • External FL clients → combiner: gRPC entering from outside the cluster, authenticated by JWT when auth is enabled. FL clients using the EdgeClient connect with TLS, so the combiner secureMode must be configured accordingly.

  • Intra-cluster / data tier: segmented by NetworkPolicies (default-deny ingress) when network segmentation is enabled.
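
To confirm which paths the edge actually routes, the Ingress rules can be listed with kubectl. The following is a minimal sketch and not part of the chart: the function name list_ingress_routes is hypothetical, and the default pattern admin assumes that admin Services carry that word in their name, which may not hold for this chart.

```shell
# Hypothetical helper: print "path -> backend service" for every Ingress rule
# in a namespace and flag backends whose Service name matches a pattern.
# Assumption: admin Services contain "admin" in their name; pass another
# pattern as the second argument if they do not.
list_ingress_routes() {
  ns="$1"
  pattern="${2:-admin}"
  kubectl -n "$ns" get ingress -o jsonpath='{range .items[*]}{range .spec.rules[*]}{range .http.paths[*]}{.path}{"\t"}{.backend.service.name}{"\n"}{end}{end}{end}' |
    awk -F'\t' -v pat="$pattern" '
      $2 ~ pat { print "EXPOSED? " $1 " -> " $2; flagged = 1; next }
      NF       { print "ok       " $1 " -> " $2 }
      END      { exit flagged }'
}
```

Calling list_ingress_routes <ns> returns a non-zero status if any routed backend matches the pattern, so it can gate a CI step.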

Controls

Framework references: Pod Security Standards (PSS), the CIS Kubernetes Benchmark (CIS K8s), SOC 2 Trust Services Criteria, and ISO/IEC 27001:2022 Annex A.

| Control (as implemented) | Default | Framework mapping |
|---|---|---|
| Run as non-root (runAsNonRoot, runAsUser ≠ 0), all workloads | on | PSS Restricted; CIS K8s 5.2.6; SOC 2 CC6.1/CC6.3; ISO A.8.2/A.8.3 |
| No privilege escalation (allowPrivilegeEscalation: false) | on | PSS Restricted; CIS K8s 5.2.5; SOC 2 CC6.1; ISO A.8.2 |
| Drop ALL Linux capabilities | on | PSS Restricted; CIS K8s 5.2.8/5.2.9; SOC 2 CC6.1; ISO A.8.2 |
| Seccomp RuntimeDefault, all pods | on | PSS Restricted; SOC 2 CC6.1/CC6.8; ISO A.8.2/A.8.31 |
| Read-only root filesystem, app containers (writable scratch via emptyDir) | on [1] | PSS Restricted (rec.); SOC 2 CC6.1/CC6.8; ISO A.8.2 |
| No service-account token automount | on | CIS K8s 5.1.5/5.1.6; SOC 2 CC6.1/CC6.3; ISO A.8.2/A.8.3 |
| Network segmentation: default-deny ingress + least-privilege allows | opt-in [2] | CIS K8s 5.3.2; SOC 2 CC6.1/CC6.6; ISO A.8.20/A.8.22 |
| TLS in transit at the edge: cert-manager-issued Ingress certificates | opt-in | SOC 2 CC6.7; ISO A.8.24 |
| Authentication & authorization: Ory Kratos + Hydra (OIDC/OAuth2), gRPC JWT | opt-in | SOC 2 CC6.1/CC6.2/CC6.3; ISO A.5.15/A.5.17/A.8.5 |
| Secrets management: secretKeyRef; generated+retained or existingSecret; per-install Kratos JWKS; admin password never in plaintext env; external-secrets / sealed-secrets compatible | on | CIS K8s 5.4.1; SOC 2 CC6.1/CC6.3; ISO A.8.24/A.5.10 |
| Resource requests/limits, every container | on | SOC 2 A1.1 (Availability); ISO A.8.6 |
| Supply-chain integrity: optional image digest pinning; CI SAST (CodeQL), image scanning (Trivy), SBOM (Syft) + vulnerability report (Grype) | partial [3] | SOC 2 CC7.1/CC8.1; ISO A.8.8/A.8.28/A.8.30 |
| Input validation: values.schema.json validates types/enums at install | on | SOC 2 CC8.1; ISO A.8.25 |
| Least-exposure ingress: only public auth APIs routed; admin APIs in-cluster only | on | SOC 2 CC6.6; ISO A.8.20/A.8.21 |

Secure defaults & configuration

| Concern | Default | How to harden / enable |
|---|---|---|
| Pod hardening | enforced | n/a (on for all workloads) |
| NetworkPolicies | off | networkPolicy.enabled=true (needs a policy CNI) |
| Edge TLS | off (http) | global.protocol=https + ingress.certManager.clusterIssuer |
| Authentication | per chart default | Kratos/Hydra; set auth.admin.* to bootstrap an admin |
| Secrets | auto-generated + retained | secrets.existingSecret with external-secrets / sealed-secrets |
| Image pinning | tags | image.coreDigest / image.frontendDigest |
| Bundled data backends | in-cluster (dev) | point at managed/hardened services (*.deploy=false) |
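
The opt-in settings listed above can be combined in a single upgrade. The following is a sketch under stated assumptions: the value keys are the ones named in this section, while the helper name scaleout_hardening_flags, the ClusterIssuer name and the digests are placeholders to be replaced with real values.

```shell
# Hypothetical helper: print the --set flags that switch on the opt-in controls.
# Arguments: <clusterIssuer> <coreDigest> <frontendDigest> (all placeholders).
scaleout_hardening_flags() {
  issuer="$1"; core_digest="$2"; frontend_digest="$3"
  printf '%s\n' \
    "--set networkPolicy.enabled=true" \
    "--set global.protocol=https" \
    "--set ingress.certManager.clusterIssuer=${issuer}" \
    "--set image.coreDigest=${core_digest}" \
    "--set image.frontendDigest=${frontend_digest}"
}
```

One possible invocation is helm upgrade --install <release> charts/scaleout -n <ns> $(scaleout_hardening_flags <issuer> <core-digest> <frontend-digest>), left unquoted so the shell splits the flags into separate arguments. The digest format the chart expects is not stated here, so check values.schema.json before relying on this.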

Data protection

  • In transit (edge): when edge TLS is enabled (it is off by default), TLS terminates at the Ingress with cert-manager-issued certificates; auth admin APIs are never exposed externally.

  • In transit (intra-cluster): currently plaintext within the cluster; transparent pod-to-pod and database mTLS is on the roadmap. NetworkPolicies restrict reachability in the meantime.

  • At rest: persistent data (Postgres/Mongo/MinIO) is on PersistentVolumes — encryption at rest is provided by the cluster/cloud storage class (e.g. GKE encrypts persistent disks by default). Enable etcd encryption at rest / CMEK at the cluster level for Secrets. For production, prefer managed data backends and an external secret store over the bundled ones.

  • Secrets exposure: credentials are referenced via secretKeyRef / envFrom, never baked into manifests or images.
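
Because encryption at rest depends on the storage class behind each volume, it can help to list which class every PersistentVolumeClaim uses. This is a minimal sketch; the function name list_pvc_storage_classes is hypothetical, and whether a given class is encrypted still has to be checked against the cluster or cloud configuration.

```shell
# Hypothetical helper: list each PersistentVolumeClaim with its StorageClass so
# the class can be checked against the cluster's encryption-at-rest settings.
list_pvc_storage_classes() {
  ns="$1"
  kubectl -n "$ns" get pvc -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.storageClassName}{"\n"}{end}' |
    awk -F'\t' 'NF { print $1 ": " ($2 == "" ? "<no storage class set>" : $2) }'
}
```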

Verification

# Pod-level runAsNonRoot for every pod (expect true; an empty value means the field is not set at pod level)
kubectl -n <ns> get pods -o jsonpath='{range .items[*]}{.metadata.name}{": runAsNonRoot="}{.spec.securityContext.runAsNonRoot}{"\n"}{end}'

# No service-account token mounted
kubectl -n <ns> get pod <pod> -o jsonpath='{.spec.automountServiceAccountToken}'   # false

# NetworkPolicy enforcement: a pod in another namespace is blocked from the data tier.
# The namespace "other" must already exist; --rm -i attaches to the probe so its
# output is printed, then deletes the pod.
kubectl -n other run probe --rm -i --image=busybox --restart=Never --command -- \
  sh -c 'nc -w5 -z <release>-postgres.<ns> 5432 && echo OPEN || echo BLOCKED'      # BLOCKED

# TLS certificate issued for the host
kubectl -n <ns> get certificate

# Automated end-to-end (auth off + on, helm test, cross-namespace block)
charts/scaleout/test/e2e.sh --mode both

The chart’s helm test hook and test/e2e.sh run during release validation and can be reproduced on any cluster whose CNI enforces NetworkPolicies (validated on GKE with Dataplane V2).
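
The commands above check the pod-level context only. The container-level controls (allowPrivilegeEscalation, readOnlyRootFilesystem, dropped capabilities) could be audited with something like the following sketch. The function name audit_container_security is hypothetical, init containers are not covered, and containers that intentionally keep a writable root filesystem will be reported.

```shell
# Hypothetical helper: report containers whose securityContext deviates from
# the hardened defaults. Prints nothing and returns 0 when all comply.
audit_container_security() {
  ns="$1"
  kubectl -n "$ns" get pods -o jsonpath='{range .items[*]}{range .spec.containers[*]}{.name}{"\t"}{.securityContext.allowPrivilegeEscalation}{"\t"}{.securityContext.readOnlyRootFilesystem}{"\t"}{.securityContext.capabilities.drop}{"\n"}{end}{end}' |
    awk -F'\t' '
      NF {
        r = ""
        if ($2 != "false") r = r " allowPrivilegeEscalation"
        if ($3 != "true")  r = r " readOnlyRootFilesystem"
        if ($4 !~ /ALL/)   r = r " capabilities.drop"
        if (r != "") { print $1 ":" r; bad = 1 }
      }
      END { exit bad }'
}
```

Each output line names a container followed by the fields that did not match, so a clean namespace produces no output.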

Residual risks & roadmap

  • Intra-cluster traffic (including database connections) is not encrypted. NetworkPolicies limit reachability; transparent mTLS via a service mesh is planned.

  • External FL client → combiner gRPC transport is authenticated by JWT only when auth is enabled. EdgeClient connects with TLS by default, so the combiner secureMode must be configured to match.

  • Core read-only root filesystem is smoke-tested but not validated under a full federated-training round; a toggle is available (securityContext.readOnlyRootFilesystem).

  • Secret rotation — generated secrets are intentionally retained across upgrades. To rotate: update the value in the chart-managed Secret directly, or use secrets.existingSecret with an external secret manager (external-secrets / sealed-secrets).

  • Bundled data backends are development-grade — use managed/hardened backends in production.
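
Rotating a value in the chart-managed Secret directly, as described in the secret rotation item above, could look like the following sketch. The function name rotate_secret_key and the Secret and key names are hypothetical. Note also that patching the Secret does not change the credential inside the backing service (for example a database password), and workloads that read the value through environment variables only see it after a restart.

```shell
# Hypothetical helper: overwrite one key of a Secret with a fresh random value.
# Arguments: <namespace> <secret-name> <key>; all names are placeholders.
rotate_secret_key() {
  ns="$1"; secret="$2"; key="$3"
  # 32 random bytes, hex-encoded, so the JSON patch needs no escaping
  value="$(head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n')"
  kubectl -n "$ns" patch secret "$secret" --type merge \
    -p "{\"stringData\":{\"${key}\":\"${value}\"}}"
}
```

After a call such as rotate_secret_key <ns> <secret> <key>, restarting the consumers with kubectl -n <ns> rollout restart deployment <name> lets them pick up the new value.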

Supply chain & vulnerability management

  • Images can be pinned by digest for immutability.

  • CI pipelines run CodeQL (SAST) and Trivy (container image scanning); a Syft SBOM and a Grype vulnerability report are published per release.

  • The chart consumes images from a private registry; restrict pulls with imagePullSecrets.
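
To see whether digest pinning is actually in effect for running workloads, the image references in the pod specs can be inspected. This is a minimal sketch with a hypothetical function name, list_unpinned_images; it looks at regular containers only and treats any reference without "@sha256:" as unpinned.

```shell
# Hypothetical helper: list distinct container images in a namespace that are
# referenced by tag rather than by digest (no "@sha256:" in the reference).
# Returns non-zero when at least one unpinned image is found.
list_unpinned_images() {
  ns="$1"
  kubectl -n "$ns" get pods -o jsonpath='{range .items[*]}{range .spec.containers[*]}{.image}{"\n"}{end}{end}' |
    sort -u |
    awk 'NF && $0 !~ /@sha256:/ { print; found = 1 } END { exit found }'
}
```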

The same content is maintained in the chart repository as charts/scaleout/SECURITY.md for engineers working in the codebase.