Triq Engineering February 1, 2025

Row-Level Security as a Multi-Tenancy Primitive

Why Basalt uses Postgres RLS for tenant isolation instead of relying on application-level filtering — and what that means for security.

Multi-tenancy is not a UI feature. It is a security boundary. When a platform stores virtual machines, networks, licenses, API keys, image references, audit events, and support records for many organizations in the same control plane, every query that touches tenant-owned data is part of that boundary.

Basalt enforces that boundary with Postgres Row-Level Security. Tenant isolation is not implemented as a convention that every application query must remember to follow. It is a database rule: tables containing tenant data have RLS policies that restrict rows by tenant_id using current_setting('app.current_tenant'). The gateway sets app.current_tenant on each database connection before executing tenant-scoped queries. After that, Postgres decides which rows are visible.
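A minimal sketch of what such a policy and the per-request tenant context could look like. Only virtual_machines, tenant_id, and app.current_tenant come from this article; the other columns, the uuid type, the policy name tenant_isolation, and the tenant value are assumptions for illustration. The sketch scopes the setting to a transaction with set_config(..., true), which is one way to keep it from outliving a request on a pooled connection, and is not necessarily how the gateway does it.

```sql
-- Hypothetical table layout; only tenant_id is taken from the article.
CREATE TABLE virtual_machines (
    id        uuid PRIMARY KEY,
    tenant_id uuid NOT NULL,
    name      text NOT NULL
);

-- Turn RLS on. FORCE makes the policy apply to the table owner as well.
ALTER TABLE virtual_machines ENABLE ROW LEVEL SECURITY;
ALTER TABLE virtual_machines FORCE ROW LEVEL SECURITY;

-- USING filters rows that are read, updated, or deleted;
-- WITH CHECK rejects inserted or updated rows that belong to another tenant.
CREATE POLICY tenant_isolation ON virtual_machines
    USING (tenant_id = current_setting('app.current_tenant')::uuid)
    WITH CHECK (tenant_id = current_setting('app.current_tenant')::uuid);

-- Per request: set the tenant context, then run ordinary queries.
BEGIN;
-- Third argument true = local to this transaction (assumed scoping choice).
SELECT set_config('app.current_tenant',
                  '11111111-1111-1111-1111-111111111111', true);
SELECT * FROM virtual_machines;  -- no tenant predicate, still filtered by the policy
COMMIT;
```

Because current_setting is called without its missing_ok argument, a query that reads rows with no tenant context set fails with an error instead of returning data.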

That changes the failure mode. With application-layer filtering, the security model depends on every query including the right predicate:

WHERE tenant_id = $1

One missed predicate turns into a cross-tenant data leak. A new report endpoint, an admin helper, a background task, or a JOIN added during a refactor can accidentally return rows for every tenant. Code review can reduce that risk, but it cannot remove it. Tests can catch examples, but they rarely cover every future query shape.

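With RLS, even a careless SELECT * FROM virtual_machines is evaluated through the current tenant policy. The application can still have bugs, but as long as it connects as a role that is subject to the policy, the database will not return rows outside the active tenant context. That condition matters: superusers and roles with BYPASSRLS always bypass RLS, and table owners do too unless the table sets FORCE ROW LEVEL SECURITY. Basalt treats that property as a primitive, not an optional hardening layer.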

What RLS prevents

The obvious attack is direct data exposure: Tenant A seeing Tenant B’s VMs, networks, licenses, or tickets because an endpoint forgot a WHERE tenant_id = $1. RLS prevents that because Postgres applies the policy predicate to every scan of a protected table, so Tenant B’s rows are filtered out before the query can return them.

The less obvious attacks are often more important. Aggregate endpoints can leak counts or operational hints: how many licenses another tenant has, whether a product exists in another account, whether a host placement record references a tenant’s workload, or whether a support ticket ID is valid. Joins are another common failure path. A query that correctly filters the primary table but joins to an unfiltered related table can expose metadata from the wrong tenant. RLS applies at the table boundary, so each protected table enforces its own policy regardless of how the query is assembled.
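To illustrate the join and aggregate cases, here is a hedged sketch with two protected tables. The table names networks and network_attachments, their columns, and the tenant value are hypothetical; the point is only that each table carries its own policy, so neither side of the join can contribute another tenant's rows.

```sql
-- Hypothetical tables for illustration.
CREATE TABLE networks (
    id         uuid PRIMARY KEY,
    tenant_id  uuid NOT NULL,
    cidr_block cidr NOT NULL
);
CREATE TABLE network_attachments (
    vm_id      uuid NOT NULL,
    network_id uuid NOT NULL REFERENCES networks (id),
    tenant_id  uuid NOT NULL,
    PRIMARY KEY (vm_id, network_id)
);

ALTER TABLE networks ENABLE ROW LEVEL SECURITY;
ALTER TABLE networks FORCE ROW LEVEL SECURITY;
ALTER TABLE network_attachments ENABLE ROW LEVEL SECURITY;
ALTER TABLE network_attachments FORCE ROW LEVEL SECURITY;

-- Policy names are scoped per table, so the same name can be reused.
CREATE POLICY tenant_isolation ON networks
    USING (tenant_id = current_setting('app.current_tenant')::uuid);
CREATE POLICY tenant_isolation ON network_attachments
    USING (tenant_id = current_setting('app.current_tenant')::uuid);

BEGIN;
SELECT set_config('app.current_tenant',
                  '11111111-1111-1111-1111-111111111111', true);

-- Neither side of the join has a tenant predicate in the query text,
-- but both tables are filtered by their own policy.
SELECT a.vm_id, n.cidr_block
FROM network_attachments AS a
JOIN networks AS n ON n.id = a.network_id;

-- Aggregates only count rows the policy lets through.
SELECT count(*) FROM networks;
COMMIT;
```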

RLS also reduces the blast radius of injection and authorization bugs. SQL injection must still be prevented; parameterized queries and safe query construction remain mandatory. But if an attacker manages to change the shape of a tenant-scoped SELECT, the database policy still limits the rows that query can see to the active tenant, provided the injected SQL cannot also change app.current_tenant. Similarly, if an application handler incorrectly allows a user to call an endpoint, the query still runs inside the tenant context set by the gateway.

This is defense in depth with a concrete enforcement point. The gateway authenticates the caller, resolves the tenant, sets app.current_tenant, and executes queries through a connection configured for that tenant. Postgres enforces row visibility. The application does not get to choose whether tenant isolation applies on a per-query basis.

RLS is not RBAC

RLS answers one question: which tenant’s rows can this session see? It does not answer whether the current user may reboot a VM, issue a license, rotate an API key, or read an audit log. Basalt layers RBAC on top of RLS for that reason.

The platform defines roughly two hundred fine-grained permissions. Those permissions describe actions and resource capabilities: read a project, create a VM, manage a license, view support tickets, operate registry credentials, administer tenant users, or inspect audit events. RBAC is evaluated in the gateway before actions are performed. RLS remains active underneath it.

The combination matters. RBAC prevents a user inside Tenant A from performing actions they are not authorized to perform. RLS prevents any path, authorized or buggy, from crossing into Tenant B’s data. They are different controls with different failure modes. Combining them means a mistake in one layer does not automatically collapse the other.

Token and secret handling follows the same pattern of explicit boundaries. Basalt uses FIPS 140-3 cryptography through aws-lc-rs for token and secret operations. That does not replace tenant isolation or RBAC, but it keeps authentication material and signed transfer flows aligned with the platform’s security posture. Secrets should not be exposed across tenants, and the cryptographic primitives used to protect them should not be improvised per feature.

Why not one database per tenant?

Separate databases per tenant create a strong isolation story, but they also create an operations model that grows with tenant count. Every tenant means another database to provision, migrate, back up, monitor, restore, and include in cross-version upgrade testing. Schema changes become fleet operations. Reporting and control-plane maintenance require fan-out across many databases. Small tenants cost almost as much operationally as large tenants.

For some products, that trade-off is correct. If tenants require customer-managed keys with isolated database clusters, strict data residency per account, or contractual single-tenant deployment, separate databases can be the right primitive. Basalt’s default model optimizes for a shared infrastructure control plane where strong logical isolation, consistent schema evolution, and operational simplicity all matter.

A single Postgres database with RLS gives Basalt one schema, one migration path, one backup strategy, and one place to enforce relational integrity across the control plane. Tenant rows still carry tenant identity. Policies still restrict visibility. Operators do not have to coordinate hundreds of database upgrades just to add a field to a task record or an audit event.

The trade-offs are real

RLS is not free. Every protected query pays the cost of policy evaluation. Good indexes on tenant_id are not optional; they are part of the security design and the performance design. Query plans must be inspected with the policy in mind, because the effective predicate includes more than what appears in application code.
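As a rough sketch of what that inspection could look like, assuming a virtual_machines table with tenant_id and name columns and a policy on tenant_id = current_setting('app.current_tenant')::uuid (the index names, the name column, and the tenant value are hypothetical):

```sql
-- Index the policy column, alone and as the leading column of
-- composite indexes used by common per-tenant lookups.
CREATE INDEX virtual_machines_tenant_id_idx
    ON virtual_machines (tenant_id);
CREATE INDEX virtual_machines_tenant_name_idx
    ON virtual_machines (tenant_id, name);

-- Inspect the plan as a role that is subject to RLS. A superuser or a
-- BYPASSRLS role would see a plan without the policy predicate.
BEGIN;
SELECT set_config('app.current_tenant',
                  '11111111-1111-1111-1111-111111111111', true);
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM virtual_machines WHERE name = 'web-01';
ROLLBACK;
```

The plan output should show the policy's tenant_id comparison as a filter or index condition alongside the name predicate, which is the "effective predicate" the application code never wrote.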

RLS also forces disciplined schema design. Tenant-owned tables need clear ownership. Shared reference tables need explicit decisions: are they global, tenant-scoped, or both? Cross-tenant administrative workflows cannot casually run under a tenant context. They require explicit policy bypass, such as using a dedicated role and controlled SET ROLE paths, with auditability around why that broader view was needed.
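One possible shape for such a bypass path, sketched with hypothetical role names (basalt_gateway, basalt_cross_tenant) and assuming a virtual_machines table with a tenant_id column already exists. Creating a BYPASSRLS role needs elevated privileges, typically a superuser.

```sql
-- The login role the gateway normally uses; it stays subject to RLS.
CREATE ROLE basalt_gateway LOGIN;

-- A non-login role for cross-tenant administrative reads.
CREATE ROLE basalt_cross_tenant NOLOGIN BYPASSRLS;
GRANT SELECT ON virtual_machines TO basalt_cross_tenant;

-- Membership lets the gateway switch roles. BYPASSRLS is a role attribute
-- and is not inherited through membership, so the gateway only bypasses
-- RLS after an explicit SET ROLE.
GRANT basalt_cross_tenant TO basalt_gateway;

-- Cross-tenant workflow: switch role for one transaction only.
BEGIN;
SET LOCAL ROLE basalt_cross_tenant;
-- An audit record explaining why the broader view was needed would be
-- written here (table and format not specified in the article).
SELECT tenant_id, count(*) AS vm_count
FROM virtual_machines
GROUP BY tenant_id;
COMMIT;  -- SET LOCAL ROLE reverts at the end of the transaction
```

Keeping the switch inside a transaction with SET LOCAL means a forgotten RESET ROLE cannot leave a pooled connection in the broader role.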

Testing changes as well. Tests must verify that policies do not drift when tables are added or relationships change. It is not enough to test that the happy-path query returns a VM for the right tenant. The suite must also prove that the same query cannot see another tenant’s VM, and that joins do not accidentally bypass isolation. RLS moves enforcement into the database; the tests have to treat database policy as production code.
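A hedged sketch of two such checks, written directly in SQL. It assumes a virtual_machines table with id, tenant_id, and name columns, a tenant policy with a matching WITH CHECK clause, and a test role that is subject to RLS; the tenant IDs and the public schema are placeholders.

```sql
-- 1. Isolation check: seed two tenants, then prove tenant A cannot see tenant B.
BEGIN;
SELECT set_config('app.current_tenant',
                  'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa', true);
INSERT INTO virtual_machines (id, tenant_id, name)
VALUES (gen_random_uuid(), 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa', 'a-vm');

SELECT set_config('app.current_tenant',
                  'bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb', true);
INSERT INTO virtual_machines (id, tenant_id, name)
VALUES (gen_random_uuid(), 'bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb', 'b-vm');

DO $$
DECLARE
    visible integer;
    leaked  integer;
BEGIN
    PERFORM set_config('app.current_tenant',
                       'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa', true);
    SELECT count(*) INTO visible FROM virtual_machines;
    SELECT count(*) INTO leaked
    FROM virtual_machines
    WHERE tenant_id = 'bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb';

    -- Happy path must still work, otherwise "sees nothing" would pass.
    IF visible < 1 THEN
        RAISE EXCEPTION 'tenant A cannot see its own rows';
    END IF;
    IF leaked <> 0 THEN
        RAISE EXCEPTION 'tenant isolation broken: % foreign rows visible', leaked;
    END IF;
END $$;
ROLLBACK;  -- leave no test data behind

-- 2. Drift check: any table with a tenant_id column but RLS not enabled.
--    The test passes when this returns zero rows.
SELECT c.relname
FROM pg_class AS c
JOIN pg_namespace AS n ON n.oid = c.relnamespace
JOIN pg_attribute AS a
  ON a.attrelid = c.oid
 AND a.attname = 'tenant_id'
 AND NOT a.attisdropped
WHERE c.relkind IN ('r', 'p')
  AND n.nspname = 'public'
  AND NOT c.relrowsecurity;
```

The second query is what catches a newly added tenant-owned table that shipped without a policy, independent of whether any endpoint test happens to exercise it.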

Those costs are acceptable because the alternative is worse. Application-layer filtering makes tenant isolation a habit. Basalt makes it a database invariant. When the gateway sets app.current_tenant, Postgres becomes an active participant in the security model, not a passive store waiting for the application to remember every predicate.

For an infrastructure platform, that distinction is fundamental. Tenant isolation has to survive new endpoints, refactors, admin tooling, background jobs, and tired humans debugging incidents at 2 a.m. Row-Level Security gives Basalt a boundary that is always in the query path, exactly where the data lives.