RBAC & permissions

Authorization on the DataFlow AI Platform is governed by a hierarchical role-based access-control (RBAC) model. The catch: there are three different role vocabularies — Keycloak realm roles, the backend DataFlowRole hierarchy, and the frontend personas — that must be reconciled by hard-coded mapping tables. This page documents all three, the full permission matrices, how enforcement works, and the known gaps.


Three role vocabularies

A single human user is described by three different role names depending on which layer you are looking at. Understanding the mapping between them is essential to reasoning about access.

Layer | Vocabulary | Source
Identity provider | Keycloak realm roles | realm-export.json
Backend services | DataFlowRole hierarchy | RBACService.kt (common module)
Frontend SPA | Personas (PersonaId) | frontend/src/types/permissions.ts

1. Keycloak realm roles

The dataflow realm ships six realm roles: org_admin, workspace_admin, developer, analyst, operator, and viewer. org_admin is a composite role that includes the other five.

2. Backend DataFlowRole hierarchy

RBACService.DataFlowRole defines five ordered roles, each with a numeric level:

ADMIN(100) > ENGINEER(75) > ANALYST(50) > STEWARD(40) > VIEWER(25)

A role grants any permission whose requiredRole.level is less than or equal to the user's role level (RBACService.hasPermission).
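
The level comparison can be modelled in a few lines. This is a minimal TypeScript sketch of the rule as described here, not the RBACService.kt source; the ROLE_LEVEL table and the function signature are illustrative assumptions.

```typescript
// Illustrative model of the documented level comparison (not the Kotlin source).
type DataFlowRole = "ADMIN" | "ENGINEER" | "ANALYST" | "STEWARD" | "VIEWER";

// Numeric levels as documented for RBACService.DataFlowRole.
const ROLE_LEVEL: Record<DataFlowRole, number> = {
  ADMIN: 100,
  ENGINEER: 75,
  ANALYST: 50,
  STEWARD: 40,
  VIEWER: 25,
};

// A permission is granted when its required level does not exceed the user's level.
function hasPermission(userRole: DataFlowRole, requiredRole: DataFlowRole): boolean {
  return ROLE_LEVEL[requiredRole] <= ROLE_LEVEL[userRole];
}

console.log(hasPermission("ENGINEER", "ANALYST")); // true: 50 <= 75
console.log(hasPermission("STEWARD", "ANALYST"));  // false: 50 > 40
console.log(hasPermission("STEWARD", "VIEWER"));   // true: 25 <= 40
```

Because the check is purely numeric, a role's position in the list decides everything: STEWARD at 40 cannot reach any ANALYST-level permission.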

3. Frontend personas

The SPA models four personas (PersonaId): admin, engineer, analyst, and steward. Each persona maps to a fixed set of permissions and a list of allowed route prefixes.

The vocabularies do not line up

Keycloak ships developer, operator, and org_admin — there is no STEWARD realm role and no ENGINEER realm role. The backend and frontend, by contrast, expect engineer and steward. A "steward" in Keycloak can only arrive via an AD group (PLK-BI-Stewards) or a data_steward role name, neither of which is in the shipped realm export. The seeded example user "Tomasz Zielinski / Data Steward" actually carries the realm role viewer.


Mapping tables

Keycloak groups → realm roles

The realm export maps four corporate-style groups onto realm roles:

Group | Realm role granted
/PLK-BI-Admins | org_admin
/PLK-BI-Engineers | developer
/PLK-BI-Analysts | analyst
/PLK-BI-Operations | operator

KeycloakJwtConverter and RBACService additionally know about PLK-BI-Stewards and PLK-BI-Viewers, but those groups are not present in the shipped realm export.

Seeded users

Username | Realm role | Group | Title attribute
anna.kowalska@polkomtel.pl | developer | PLK-BI-Engineers | Data Engineer
marek.nowak@polkomtel.pl | analyst | PLK-BI-Analysts | Business Analyst
katarzyna.wisniewski@polkomtel.pl | org_admin | PLK-BI-Admins | Platform Admin
tomasz.zielinski@polkomtel.pl | viewer | (none) | Data Steward

All four seeded users have empty credentials and requiredActions: [UPDATE_PASSWORD].

Any raw role → DataFlowRole

The gateway maps every raw role string it sees — realm roles, client roles, group names, and group paths — onto a single DataFlowRole (RBACService.mapSingleRole / KeycloakJwtConverter). The highest mapped role wins; the default is VIEWER.

Raw role / group | Resolved DataFlowRole
PLK-BI-Admins, /PLK-BI-Admins | ADMIN
admin, dataflow-admin, org_admin, workspace_admin, platform_admin | ADMIN
PLK-BI-Engineers | ENGINEER
engineer, dataflow-engineer, data_engineer, developer, operator | ENGINEER
PLK-BI-Analysts | ANALYST
analyst, dataflow-analyst, business_analyst | ANALYST
PLK-BI-Stewards | STEWARD
steward, dataflow-steward, data_steward | STEWARD
PLK-BI-Viewers | VIEWER
viewer, dataflow-viewer | VIEWER
ROLE_*-prefixed names | Same mapping after the prefix is stripped
Unrecognised role | Skipped; no implicit grant (FR-020)

KeycloakJwtConverter.extractAuthorities converts the resolved role into Spring ROLE_* authorities. If a JWT has no mappable roles, the authority set is empty and the request is rejected by @PreAuthorize — a deliberate anti-privilege-escalation choice. The org_admin / developer / operator mappings were added late to fix users who were otherwise silently downgraded to ANALYST or VIEWER.
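
The mapping rules can be sketched as a lookup followed by a "highest level wins" reduction. The TypeScript below is an illustrative model, not the RBACService or KeycloakJwtConverter source. It assumes exact, case-sensitive name matching, assumes a group path differs from its group name only by a leading slash, and assumes a single authority is emitted for the resolved role; none of these details is confirmed by the code described here.

```typescript
// Illustrative model of the raw-role mapping table (not the Kotlin source).
type DataFlowRole = "ADMIN" | "ENGINEER" | "ANALYST" | "STEWARD" | "VIEWER";

const ROLE_LEVEL: Record<DataFlowRole, number> = {
  ADMIN: 100, ENGINEER: 75, ANALYST: 50, STEWARD: 40, VIEWER: 25,
};

// Raw role or group name -> DataFlowRole, copied from the mapping table.
const RAW_ROLE_MAP: Record<string, DataFlowRole> = {
  "PLK-BI-Admins": "ADMIN", "admin": "ADMIN", "dataflow-admin": "ADMIN",
  "org_admin": "ADMIN", "workspace_admin": "ADMIN", "platform_admin": "ADMIN",
  "PLK-BI-Engineers": "ENGINEER", "engineer": "ENGINEER", "dataflow-engineer": "ENGINEER",
  "data_engineer": "ENGINEER", "developer": "ENGINEER", "operator": "ENGINEER",
  "PLK-BI-Analysts": "ANALYST", "analyst": "ANALYST", "dataflow-analyst": "ANALYST",
  "business_analyst": "ANALYST",
  "PLK-BI-Stewards": "STEWARD", "steward": "STEWARD", "dataflow-steward": "STEWARD",
  "data_steward": "STEWARD",
  "PLK-BI-Viewers": "VIEWER", "viewer": "VIEWER", "dataflow-viewer": "VIEWER",
};

// Maps one raw string; returns null for unrecognised names (no implicit grant).
function mapSingleRole(raw: string): DataFlowRole | null {
  let name = raw;
  if (name.indexOf("/") === 0) name = name.slice(1);     // ASSUMPTION: group path = "/" + group name
  if (name.indexOf("ROLE_") === 0) name = name.slice(5); // ROLE_* names map after the prefix is stripped
  const mapped = Object.prototype.hasOwnProperty.call(RAW_ROLE_MAP, name)
    ? RAW_ROLE_MAP[name]
    : undefined;
  return mapped ?? null;
}

// Highest mapped role wins; null means nothing in the token was mappable.
function resolveRole(rawRoles: string[]): DataFlowRole | null {
  let best: DataFlowRole | null = null;
  for (const raw of rawRoles) {
    const mapped = mapSingleRole(raw);
    if (mapped !== null && (best === null || ROLE_LEVEL[mapped] > ROLE_LEVEL[best])) {
      best = mapped;
    }
  }
  return best;
}

// ASSUMPTION: one ROLE_* authority for the resolved role; empty when nothing maps.
function toAuthorities(rawRoles: string[]): string[] {
  const role = resolveRole(rawRoles);
  return role === null ? [] : ["ROLE_" + role];
}

console.log(resolveRole(["viewer", "developer", "not-a-role"])); // ENGINEER
console.log(toAuthorities(["devloper"]));                        // [] (misspelled role is skipped)
```

The second call shows the failure mode in miniature: a one-letter typo in a role name produces no authority at all rather than an error.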

Why the mapping defaults matter

The "highest wins, default VIEWER, unrecognised skipped" rules mean an unknown role string never escalates privilege — but a misspelled expected role silently downgrades a user. The mapping tables are the single fragile point reconciling all three vocabularies.


Backend permission matrix

RBACService defines a catalogue of 30 permissions, each tagged with a requiredRole. Because the model is hierarchical, a role receives a permission when its level is greater than or equal to the permission's required-role level.

Permissions by domain

Domain | Permission → required role
Pipeline | VIEW_PIPELINE → VIEWER; CREATE/EDIT/RUN/SCHEDULE_PIPELINE → ENGINEER; DELETE_PIPELINE → ADMIN
Connection | VIEW_CONNECTION → VIEWER; CREATE/EDIT/TEST_CONNECTION → ENGINEER; DELETE_CONNECTION → ADMIN
Monitoring | VIEW_MONITOR → VIEWER; ACKNOWLEDGE_ALERT → ANALYST; CONFIGURE_ALERT → ENGINEER; DELETE_ALERT → ADMIN
Lineage | VIEW_LINEAGE → VIEWER; EDIT_LINEAGE → ENGINEER
Migration | VIEW_MIGRATION → VIEWER; CREATE_MIGRATION → ENGINEER; EXECUTE_MIGRATION → ADMIN
AI Copilot | USE_COPILOT → ANALYST; CONFIGURE_COPILOT → ADMIN
Workspace | VIEW_WORKSPACE → VIEWER; EDIT_WORKSPACE/MANAGE_MEMBERS → ADMIN
User management | VIEW_USERS → ANALYST; MANAGE_USERS → ADMIN
Audit | VIEW_AUDIT_LOG → ANALYST; EXPORT_AUDIT_LOG → ADMIN
System | VIEW_SYSTEM_SETTINGS/MODIFY_SYSTEM_SETTINGS → ADMIN

Roles × permissions (hierarchical)

✓ = granted, - = not granted. Levels: ADMIN 100, ENGINEER 75, ANALYST 50, STEWARD 40, VIEWER 25.

Permission tier (required role) | ADMIN | ENGINEER | ANALYST | STEWARD | VIEWER
VIEW_PIPELINE / VIEW_CONNECTION / VIEW_MONITOR / VIEW_LINEAGE / VIEW_MIGRATION / VIEW_WORKSPACE (VIEWER) | ✓ | ✓ | ✓ | ✓ | ✓
ACKNOWLEDGE_ALERT / USE_COPILOT / VIEW_USERS / VIEW_AUDIT_LOG (ANALYST) | ✓ | ✓ | ✓ | - | -
CREATE/EDIT/RUN/SCHEDULE_PIPELINE / CREATE/EDIT/TEST_CONNECTION / CONFIGURE_ALERT / EDIT_LINEAGE / CREATE_MIGRATION (ENGINEER) | ✓ | ✓ | - | - | -
DELETE_PIPELINE / DELETE_CONNECTION / DELETE_ALERT / EXECUTE_MIGRATION / CONFIGURE_COPILOT / EDIT_WORKSPACE / MANAGE_MEMBERS / MANAGE_USERS / EXPORT_AUDIT_LOG / VIEW_SYSTEM_SETTINGS / MODIFY_SYSTEM_SETTINGS (ADMIN) | ✓ | - | - | - | -

STEWARD hierarchy inversion

STEWARD sits at level 40, below ANALYST at level 50. A STEWARD therefore inherits only VIEWER-level hierarchical permissions from RBACService. Any real steward authority comes exclusively from controllers that name STEWARD explicitly in their @PreAuthorize lists — not from the role hierarchy. The frontend persona model (below) instead gives the steward broad governance rights, so the two models genuinely disagree about how powerful a steward is.

Workspace-aware checks

RBACService also exposes workspace-scoped checks layered on top of the permission checks:

  • canAccessWorkspace — ADMIN sees all workspaces; the default workspace is open to everyone; any other workspace requires membership.
  • canModifyPipeline, canDeletePipeline, canRunPipeline — each evaluates the relevant permission and the workspace check.
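
The two bullets above combine into a simple conjunction of a role check and a workspace check. The sketch below is illustrative only: the UserContext shape, the "default" workspace identifier, and the pairing of each helper with EDIT_PIPELINE, DELETE_PIPELINE, and RUN_PIPELINE are assumptions, not details taken from RBACService.

```typescript
// Illustrative model of the workspace-scoped checks (not the Kotlin source).
type DataFlowRole = "ADMIN" | "ENGINEER" | "ANALYST" | "STEWARD" | "VIEWER";

const ROLE_LEVEL: Record<DataFlowRole, number> = {
  ADMIN: 100, ENGINEER: 75, ANALYST: 50, STEWARD: 40, VIEWER: 25,
};

// Hypothetical shape of the caller's identity.
interface UserContext {
  role: DataFlowRole;
  workspaceIds: string[]; // workspaces the user is a member of
}

const DEFAULT_WORKSPACE = "default"; // ASSUMPTION: identifier of the default workspace

function canAccessWorkspace(user: UserContext, workspaceId: string): boolean {
  if (user.role === "ADMIN") return true;               // ADMIN sees all workspaces
  if (workspaceId === DEFAULT_WORKSPACE) return true;   // default workspace is open to everyone
  return user.workspaceIds.indexOf(workspaceId) !== -1; // anything else requires membership
}

function meetsLevel(user: UserContext, required: DataFlowRole): boolean {
  return ROLE_LEVEL[user.role] >= ROLE_LEVEL[required];
}

// ASSUMPTION: modify = EDIT_PIPELINE (ENGINEER), run = RUN_PIPELINE (ENGINEER),
// delete = DELETE_PIPELINE (ADMIN), following the permission table.
function canModifyPipeline(user: UserContext, workspaceId: string): boolean {
  return meetsLevel(user, "ENGINEER") && canAccessWorkspace(user, workspaceId);
}

function canRunPipeline(user: UserContext, workspaceId: string): boolean {
  return meetsLevel(user, "ENGINEER") && canAccessWorkspace(user, workspaceId);
}

function canDeletePipeline(user: UserContext, workspaceId: string): boolean {
  return meetsLevel(user, "ADMIN") && canAccessWorkspace(user, workspaceId);
}

const engineer: UserContext = { role: "ENGINEER", workspaceIds: ["ws-a"] };
console.log(canModifyPipeline(engineer, "ws-a")); // true: role and membership both pass
console.log(canModifyPipeline(engineer, "ws-b")); // false: not a member of ws-b
console.log(canDeletePipeline(engineer, "ws-a")); // false: delete requires ADMIN
```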

Frontend permission matrix

The frontend permission catalogue (frontend/src/types/permissions.ts) defines 28 permissions as domain:action strings spanning pipeline, quality, governance, lineage, code, admin, system, and catalog domains. (The file's header comment says "26"; the actual list is 28.) Permissions are assigned per persona via personaPermissions, and the model is described as backed server-side by the metadata-service PolicyEngine.kt.

Persona × permissions

= granted. The admin persona holds all 28 permissions.

Permission | engineer | analyst | admin | steward
pipeline:view
pipeline:create
pipeline:edit
pipeline:run
pipeline:delete
pipeline:deploy
pipeline:debug
quality:view
quality:create
quality:edit
quality:evaluate
governance:policy
governance:review
governance:approve
lineage:view
lineage:edit
code:view
code:review
code:commit
admin:users
admin:workspace
admin:security
admin:infrastructure
admin:billing
system:settings
catalog:view
catalog:edit
catalog:classify

Note how the frontend steward persona holds rich governance:*, quality:*, lineage:edit, and catalog:* permissions — exactly the authority the backend hierarchy denies a STEWARD (level 40). This divergence is one of the known gaps.


How enforcement works

Authorization is enforced in two layers: the gateway checks token validity and role presence; each service applies fine-grained method-level and workspace/ownership checks.

 Request ──▶ API Gateway ──▶ Downstream service ──▶ @PreAuthorize ──▶ workspace check
              │                  │                       │
              │ token valid?     │ JWT re-validated      │ method-level
              │ role present?    │ X-User-* → context    │ role/permission
              ▼                  ▼                       ▼
        coarse HTTP rule    SecurityContext        fine-grained grant

Gateway-level rules

When dev-permit-reads=false, the gateway's reactive SecurityConfig applies coarse method-based rules to /api/v1/**:

HTTP method | Required authority
DELETE | ROLE_ADMIN
POST / PUT / PATCH | ROLE_ADMIN or ROLE_ENGINEER
GET | ROLE_ADMIN, ROLE_ENGINEER, ROLE_ANALYST, ROLE_STEWARD, or ROLE_VIEWER
Any other exchange | Authenticated

When dev-permit-reads=true, all GETs plus the copilot / ai / catalog-ask / search POSTs become permitAll.
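
As a rough model, the gateway decision reduces to a function of HTTP method, path, caller, and the dev-permit-reads flag. The TypeScript below is a sketch under stated assumptions, not the reactive SecurityConfig: the Caller shape and function names are hypothetical, "any other exchange" is read as "any request the rules above do not match", and the dev-mode POST exemptions (copilot, ai, catalog-ask, search) are left out.

```typescript
// Illustrative model of the coarse gateway rules (not the Spring configuration).
const ALL_ROLE_AUTHORITIES = [
  "ROLE_ADMIN", "ROLE_ENGINEER", "ROLE_ANALYST", "ROLE_STEWARD", "ROLE_VIEWER",
];

// Hypothetical view of the caller after token validation.
interface Caller {
  authenticated: boolean;
  authorities: string[];
}

// Authorities accepted for a request; null means "any authenticated caller".
function requiredAuthorities(method: string, path: string): string[] | null {
  if (path.indexOf("/api/v1/") !== 0) return null; // outside /api/v1/**
  switch (method.toUpperCase()) {
    case "DELETE":
      return ["ROLE_ADMIN"];
    case "POST":
    case "PUT":
    case "PATCH":
      return ["ROLE_ADMIN", "ROLE_ENGINEER"];
    case "GET":
      return ALL_ROLE_AUTHORITIES;
    default:
      return null;
  }
}

function gatewayAllows(method: string, path: string, caller: Caller, devPermitReads: boolean): boolean {
  // dev-permit-reads=true turns every GET into permitAll.
  if (devPermitReads && method.toUpperCase() === "GET") return true;
  if (!caller.authenticated) return false;
  const required = requiredAuthorities(method, path);
  if (required === null) return true;
  return caller.authorities.some((a) => required.indexOf(a) !== -1);
}

const analystCaller: Caller = { authenticated: true, authorities: ["ROLE_ANALYST"] };
console.log(gatewayAllows("GET", "/api/v1/pipelines", analystCaller, false));  // true
console.log(gatewayAllows("POST", "/api/v1/pipelines", analystCaller, false)); // false
```

Note that an authenticated caller with an empty authority set fails every /api/v1/** rule in this model, which matches the "no mappable roles means rejected" behaviour described for the converter.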

Service-level @PreAuthorize

Downstream services run @EnableMethodSecurity(prePostEnabled = true), so controllers carry @PreAuthorize annotations. Because KeycloakJwtConverter produces real ROLE_* authorities, hasRole / hasAnyRole evaluate genuine Keycloak roles. A representative sample of observed guards:

Service / controller | Endpoint group | Guard
lineage-service BiLineageController | BI lineage | hasAnyAuthority('ROLE_ADMIN','ROLE_STEWARD')
lineage-service PropagationController | propagation | hasRole('ADMIN')
lineage-service LineageAuthoringController | authoring | hasAnyRole('STEWARD','ADMIN')
monitor-service AdminScheduledReportsController | scheduled reports | hasAnyRole('ADMIN','STEWARD')
metadata-service WorkspaceController | workspace CRUD | hasRole('ADMIN') (all methods)
metadata-service UserController | user management | hasRole('ADMIN') (all methods)
metadata-service AuditLogController | audit log | hasRole('ADMIN')
metadata-service ConnectionController | connections | read = 5-role; create/edit/test = ADMIN/ENGINEER; delete = ADMIN
metadata-service CdrMetricsController | CDR metrics | read = 5-role; write = ADMIN/ENGINEER
metadata-service PiiClassifierController | PII classify | scan = ADMIN/ANALYST/STEWARD/ENGINEER; mutate = ADMIN/STEWARD
metadata-service TagController | tag policy | hasAnyRole('governance-admin','data-steward','admin') or hasAuthority('SCOPE_governance:policy')
metadata-service SearchController | search / reindex | search = permitAll(); reindex = hasAnyRole('ADMIN','PLATFORM_OPS')
pipeline-engine ErasureController | GDPR erasure | hasAnyRole('ADMIN','STEWARD')
pipeline-engine QuarantineController | quarantine | read = ADMIN/DATA_STEWARD/DATA_ENGINEER; mutate = ADMIN/DATA_STEWARD; release = ADMIN

Frontend route guards

Frontend RBAC is UX only — the SPA is a public client and real enforcement is always server-side. Still, the frontend prevents users from navigating to pages they cannot use:

  • ProtectedRoute shows a spinner while isLoading, redirects to /login if !isAuthenticated, then calls canAccess(effectivePersona, pathname) and renders AccessDenied on failure. effectivePersona is the persona store in dev mode, otherwise user.persona.
    Figure: the Access Denied screen, rendered by ProtectedRoute when canAccess returns false (for example, an analyst typing the /admin URL directly). This is a UX guard only; the gateway and services enforce the same boundary server-side.
  • Route-prefix RBAC: data/permissions.ts defines personaRoutes; canAccess matches an allowed prefix exactly or as prefix + '/'. This map is locked by permissions-rbac.test.ts (65 assertions).
  • PermissionGate is a component-level gate built on the usePermission / usePermissions / useAnyPermission hooks, supporting single or multiple permissions with requireAll and an optional fallback.
  • useAuth().hasRole(role) checks realmRoles; useAuth().hasPermission(perm) checks the persona-derived permissions array.
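
The decision a PermissionGate-style component makes can be reduced to one pure function. The sketch below is not the actual component or hooks: the isGranted name, its signature, and the choice of "any" as the default when requireAll is omitted are all assumptions made for illustration.

```typescript
// Illustrative decision logic for a permission gate (not the real PermissionGate).
function isGranted(
  granted: string[],
  required: string | string[],
  requireAll: boolean = false, // ASSUMPTION: "any of" is the default mode
): boolean {
  const needed = typeof required === "string" ? [required] : required;
  const has = (permission: string): boolean => granted.indexOf(permission) !== -1;
  return requireAll ? needed.every(has) : needed.some(has);
}

// A hypothetical subset of a steward-like permission list.
const grantedToUser = ["governance:review", "governance:approve", "catalog:view"];

console.log(isGranted(grantedToUser, "governance:approve"));                        // true
console.log(isGranted(grantedToUser, ["governance:approve", "admin:users"]));       // true: any one is enough
console.log(isGranted(grantedToUser, ["governance:approve", "admin:users"], true)); // false: requireAll
```

A component built on this would render its children when the function returns true and the optional fallback otherwise.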

Allowed route prefixes per persona

Persona | Allowed route prefixes
engineer | /, /design-studio, /monitor, /governance, /migration, /connections, /data-marketplace, /marketplace, /my-pipelines, /pipelines, /data-browser, /templates, /admin/incidents, /telecom
analyst | /, /my-pipelines, /data-browser, /data-marketplace, /marketplace, /monitor, /governance/quality, /governance/lineage
admin | /, /design-studio, /monitor, /governance, /migration, /admin, /connections, /my-pipelines, /pipelines, /data-browser, /templates, /data-marketplace, /marketplace, /telecom
steward | /, /monitor, /governance, /data-browser, /data-marketplace, /marketplace, /admin/audit-log, /admin/access-reviews, /admin/incidents
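
The matching rule ("exactly, or as prefix + '/'") is easy to get subtly wrong, so a small sketch helps. The TypeScript below copies the prefixes from the table above and re-implements the rule as described; it is an approximation of canAccess, not the code in the repository.

```typescript
// Illustrative re-implementation of the documented route-prefix rule.
type PersonaId = "engineer" | "analyst" | "admin" | "steward";

// Prefix lists copied from the table above.
const personaRoutes: Record<PersonaId, string[]> = {
  engineer: ["/", "/design-studio", "/monitor", "/governance", "/migration", "/connections",
    "/data-marketplace", "/marketplace", "/my-pipelines", "/pipelines", "/data-browser",
    "/templates", "/admin/incidents", "/telecom"],
  analyst: ["/", "/my-pipelines", "/data-browser", "/data-marketplace", "/marketplace",
    "/monitor", "/governance/quality", "/governance/lineage"],
  admin: ["/", "/design-studio", "/monitor", "/governance", "/migration", "/admin",
    "/connections", "/my-pipelines", "/pipelines", "/data-browser", "/templates",
    "/data-marketplace", "/marketplace", "/telecom"],
  steward: ["/", "/monitor", "/governance", "/data-browser", "/data-marketplace",
    "/marketplace", "/admin/audit-log", "/admin/access-reviews", "/admin/incidents"],
};

// A path is allowed when it equals a prefix or continues it after a "/" boundary.
function canAccess(persona: PersonaId, pathname: string): boolean {
  return personaRoutes[persona].some(
    (prefix) => pathname === prefix || pathname.indexOf(prefix + "/") === 0,
  );
}

console.log(canAccess("analyst", "/governance/quality/rules")); // true
console.log(canAccess("analyst", "/governance"));               // false: only two sub-sections are listed
console.log(canAccess("analyst", "/admin"));                    // false
console.log(canAccess("steward", "/admin/audit-log"));          // true
```

Two consequences of the rule are worth noting. The "/" entry only ever matches the root page, because prefix + '/' becomes "//". And the "/" boundary stops a prefix such as /admin/audit-log from matching an unrelated sibling like /admin/audit-logs.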

Known risks & gaps

The research flagged several authorization-correctness issues worth tracking.

Mismatched @PreAuthorize roles

Several controllers guard endpoints with role names that a real Keycloak token can never carry. KeycloakJwtConverter only ever emits the five canonical authorities ROLE_ADMIN, ROLE_ENGINEER, ROLE_ANALYST, ROLE_STEWARD, and ROLE_VIEWER. Guards such as:

  • hasRole('DATA_STEWARD') and hasRole('DATA_ENGINEER') in QuarantineController
  • hasRole('PLATFORM_OPS') in SearchController reindex
  • hasAnyRole('governance-admin','data-steward') in TagController

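...would never match a genuine user. Those role branches are dead for real users: no steward or engineer token can satisfy them, so the affected endpoints are reachable only through a canonical role that the same guard also lists, such as ADMIN on the quarantine and reindex endpoints. The DATA_STEWARD / DATA_ENGINEER branches become satisfiable only in dev-permit-reads mode, where the anonymous user is granted a broad role set that happens to include ROLE_DATA_STEWARD / ROLE_DATA_ENGINEER. This is a real authorization-correctness defect.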
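
A guard of this kind can be checked mechanically. Spring Security's hasRole and hasAnyRole prepend the default ROLE_ prefix and compare the result against the caller's authorities as exact strings. The TypeScript sketch below models that comparison to list the role names in a guard that none of the five emitted authorities can satisfy; the function names are hypothetical, and this is an audit aid rather than code from the platform.

```typescript
// The five authorities KeycloakJwtConverter is documented to emit.
const EMITTED_AUTHORITIES = [
  "ROLE_ADMIN", "ROLE_ENGINEER", "ROLE_ANALYST", "ROLE_STEWARD", "ROLE_VIEWER",
];

// Simplified model of hasAnyRole: prefix each role with "ROLE_" and match exactly.
function hasAnyRole(authorities: string[], roles: string[]): boolean {
  return roles.some((role) => authorities.indexOf("ROLE_" + role) !== -1);
}

// Role names in a guard that no emitted authority can ever satisfy.
function deadRoles(guardRoles: string[]): string[] {
  return guardRoles.filter((role) => EMITTED_AUTHORITIES.indexOf("ROLE_" + role) === -1);
}

console.log(deadRoles(["ADMIN", "DATA_STEWARD", "DATA_ENGINEER"]));      // [ 'DATA_STEWARD', 'DATA_ENGINEER' ]
console.log(deadRoles(["ADMIN", "PLATFORM_OPS"]));                       // [ 'PLATFORM_OPS' ]
console.log(hasAnyRole(["ROLE_STEWARD"], ["ADMIN", "DATA_STEWARD"]));    // false: a real steward is rejected
```

Because the comparison is case-sensitive, a lowercase name such as 'admin' in a guard is flagged by the same check: it resolves to ROLE_admin, which is not ROLE_ADMIN.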

STEWARD hierarchy inversion

As noted above, STEWARD (level 40) ranks below ANALYST (level 50) in the backend hierarchy, so a STEWARD inherits only VIEWER-level permissions from RBACService — while the frontend persona model grants the steward broad governance, quality, lineage, and catalog rights. The two models disagree on how powerful a steward is, and STEWARD authority on the backend depends entirely on explicit @PreAuthorize lists rather than the hierarchy.

The dev-permit-reads escape hatch

dataflow.gateway.dev-permit-reads (default false) is a powerful switch. When true it permits all GETs unauthenticated and grants the downstream anonymous user a wide role set (ROLE_ADMIN, ROLE_DPO, ROLE_COMPLIANCE_OFFICER, ROLE_AUDITOR, ROLE_STEWARD, ROLE_OPS, ROLE_ENGINEER, ROLE_USER, ROLE_DATA_ENGINEER, ROLE_DATA_STEWARD). It must be false in production, and no environment guard was found that enforces this.

Other observations

  • Role-vocabulary fragmentation — three role systems reconciled only by hard-coded mapping tables in RBACService, KeycloakJwtConverter, and keycloak.ts. The org_admin / developer mappings were a late bug fix.
  • Doc drift — the admin guide describes a MANAGER role and a dataflow-ui client that exist in neither the code nor the realm export.

Net effect for operators

When auditing access, never assume the three role models agree. Verify against the actual RBACService hierarchy and the KeycloakJwtConverter mapping table — and confirm dev-permit-reads is false in any production deployment.


Summary

Topic | Key fact
Vocabularies | Keycloak realm roles, backend DataFlowRole, frontend personas, reconciled by mapping tables
Backend hierarchy | ADMIN(100) > ENGINEER(75) > ANALYST(50) > STEWARD(40) > VIEWER(25)
Backend permissions | 30 permissions, granted when role level ≥ required-role level
Frontend permissions | 28 domain:action permissions assigned per persona
Enforcement | Gateway coarse HTTP rules + service @PreAuthorize + workspace checks
Frontend RBAC | UX-only route guards and PermissionGate; real enforcement is server-side
Top risks | Unmatched @PreAuthorize roles, STEWARD inversion, dev-permit-reads escape hatch

For the authentication side of the picture — how an identity is established before any of these checks run — see Authentication & SSO.
