API reference: overview
The DataFlow AI Platform exposes a microservice REST API fronted by a single API Gateway. Every client request enters through the gateway, which authenticates the JWT, applies per-route rate limiting, attaches a correlation ID, and transparently proxies the call to the backend service that owns the route prefix.
Base URLs
All client traffic should target the gateway. Backend services are also reachable directly in local development for debugging.
| Environment | Base URL |
|---|---|
| Production (gateway) | https://api.dataflow.polkomtel.pl |
| Local gateway | http://localhost:8080 |
| Local metadata (direct) | http://localhost:8081/api/v1 |
| Local pipeline-engine (direct) | http://localhost:8082/api/v1 |
| Local lineage (direct) | http://localhost:8084/api/v1 |
| Local monitor (direct) | http://localhost:8085/api/v1 |
Each backend service also lists https://api.dataflow.polkomtel.pl/api/v1 as its production server, which is the same path you reach through the gateway. Every functional endpoint is prefixed with /api/v1.
Use the gateway
In production always call the gateway. Direct service ports are firewalled and intended only for local development. The gateway is what adds authentication enforcement, rate limiting, and the correlation headers documented below.
Services and ports
The platform is built from five OpenAPI-specified services plus three proxied-only services. All specs declare openapi: 3.1.0 and info.version: 1.0.0.
| Service | Spec file | Local port | Responsibility |
|---|---|---|---|
| API Gateway | openapi-gateway.yaml | 8080 | Auth, rate limiting, routing, SSE/WS passthrough |
| Metadata Service | openapi-metadata.yaml | 8081 | Pipelines, connections, quality, catalog, governance, GDPR, admin |
| Pipeline Engine | openapi-pipeline-engine.yaml | 8082 | Execution, scheduling, orchestration, Git, Flink |
| Lineage Service | openapi-lineage.yaml | 8084 | Dataset/column lineage, impact analysis, OpenLineage events |
| Monitor Service | openapi-monitor.yaml | 8085 | Dashboard metrics, alerts, alert definitions/events, SSE |
| Copilot Service | proxied | 8086 | AI copilot |
| Migration Service | proxied | 8087 | Migration plans |
| Connector SDK | proxied | 8088 | CDC connectors |
Gateway routing
The gateway maps each route prefix to a backend service. Most gateway paths are $ref-mounted re-exports of the backend specs; the gateway also defines six first-class operations of its own (health, copilot, migration, CDC).
| Route prefix | Backend service | Port |
|---|---|---|
| /api/v1/pipelines/** | metadata-service | 8081 |
| /api/v1/connections/** | metadata-service | 8081 |
| /api/v1/quality/** | metadata-service | 8081 |
| /api/v1/catalog/** | metadata-service | 8081 |
| /api/v1/governance/** | metadata-service | 8081 |
| /api/v1/gdpr/** | metadata-service | 8081 |
| /api/v1/admin/** | metadata-service | 8081 |
| /api/v1/runs/** | pipeline-engine | 8082 |
| /api/v1/scheduler/** | pipeline-engine | 8082 |
| /api/v1/orchestrator/** | pipeline-engine | 8082 |
| /api/v1/git/** | pipeline-engine | 8082 |
| /api/v1/flink/** | pipeline-engine | 8082 |
| /api/v1/runs/*/stream | pipeline-engine (WebSocket) | 8082 |
| /api/v1/lineage/** | lineage-service | 8084 |
| /api/v1/monitor/** | monitor-service | 8085 |
| /api/v1/monitor/sse/** | monitor-service (SSE) | 8085 |
| /api/v1/ai/** | copilot-service | 8086 |
| /api/v1/migration/** | migration-service | 8087 |
| /api/v1/cdc/** | connector-sdk | 8088 |
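The prefix-to-service mapping above can be mirrored on the client side, for example to pick a direct port during local debugging. The following is only a sketch: the prefixes and ports come from the table, but the function name `resolve_backend` and the "longest prefix wins" matching rule are assumptions, not the gateway's documented algorithm. The WebSocket and SSE rows are omitted because they resolve to the same services as their parent prefixes.

```python
from typing import Optional, Tuple

# Prefixes and ports copied from the routing table; the lookup logic
# below is a hypothetical illustration, not the gateway's real matcher.
ROUTES = {
    "/api/v1/pipelines": ("metadata-service", 8081),
    "/api/v1/connections": ("metadata-service", 8081),
    "/api/v1/quality": ("metadata-service", 8081),
    "/api/v1/catalog": ("metadata-service", 8081),
    "/api/v1/governance": ("metadata-service", 8081),
    "/api/v1/gdpr": ("metadata-service", 8081),
    "/api/v1/admin": ("metadata-service", 8081),
    "/api/v1/runs": ("pipeline-engine", 8082),
    "/api/v1/scheduler": ("pipeline-engine", 8082),
    "/api/v1/orchestrator": ("pipeline-engine", 8082),
    "/api/v1/git": ("pipeline-engine", 8082),
    "/api/v1/flink": ("pipeline-engine", 8082),
    "/api/v1/lineage": ("lineage-service", 8084),
    "/api/v1/monitor": ("monitor-service", 8085),
    "/api/v1/ai": ("copilot-service", 8086),
    "/api/v1/migration": ("migration-service", 8087),
    "/api/v1/cdc": ("connector-sdk", 8088),
}


def resolve_backend(path: str) -> Optional[Tuple[str, int]]:
    """Return (service, port) for the longest matching prefix, or None."""
    best_prefix = None
    for prefix in ROUTES:
        # Match the prefix itself or anything below it, but not
        # look-alikes such as /api/v1/pipelinesx.
        if path == prefix or path.startswith(prefix + "/"):
            if best_prefix is None or len(prefix) > len(best_prefix):
                best_prefix = prefix
    return ROUTES[best_prefix] if best_prefix else None
```

Unversioned paths such as /health return `None` here, since the gateway serves them itself.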
Gateway-owned operations
These six operations are defined directly by the gateway rather than re-exported from a backend spec.
| Method | Path | Operation ID | Purpose | Auth |
|---|---|---|---|---|
| GET | /health | gatewayHealth | Gateway health check | None |
| GET | /api/v1/ai/copilot | copilotEndpoint | AI copilot (proxied to copilot-service) | JWT |
| GET | /api/v1/migration/plans | migrationPlans | Migration plans (proxied to migration-service) | JWT |
| GET | /api/v1/cdc/connectors | listCdcConnectors | List CDC connectors (proxied to connector-sdk) | JWT |
| POST | /api/v1/cdc/connectors | deployCdcConnector | Deploy a CDC connector | JWT |
| GET | /api/v1/cdc/health | cdcHealth | CDC health summary | JWT |
// GET /health → 200 OK
{
  "status": "UP",
  "gateway": "dataflow-api-gateway"
}
Versioning
The platform uses path-based versioning. The current and only version is v1, embedded in every functional route as /api/v1/....
- All five OpenAPI specs declare info.version: 1.0.0.
- The gateway GET /health endpoint is unversioned; it carries no /api/v1 prefix.
- A new major version, if introduced, would appear as a new path segment (/api/v2/...); v1 routes remain stable.
Authentication
The platform authenticates with Keycloak OIDC + JWT Bearer tokens.
- Security schemes. The gateway declares two schemes: oidcAuth (openIdConnect) and bearerAuth (http/bearer/JWT). Backend services declare only bearerAuth.
- OIDC discovery: https://auth.dataflow.polkomtel.pl/realms/dataflow/.well-known/openid-configuration
- Realm: dataflow
- Client IDs: dataflow-ui (public, Authorization Code + PKCE) and dataflow-api (confidential, Client Credentials).
- Header: Authorization: Bearer <token>
Required JWT claims
Every authenticated request must carry a token with the following claims.
| Claim | Description |
|---|---|
| sub | User ID |
| realm_access.roles | User roles: one or more of ADMIN, ENGINEER, ANALYST, VIEWER |
| workspace_id | Active workspace (custom claim) |
GET /api/v1/pipelines HTTP/1.1
Host: api.dataflow.polkomtel.pl
Authorization: Bearer eyJhbGciOiJSUzI1NiIsImtpZCI6...
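Before sending a request, a client can check that its token carries the three required claims. The sketch below decodes the JWT payload without verifying the signature, so it is only a local sanity check and never a substitute for the gateway's validation. The helper names `decode_claims` and `missing_claims` are hypothetical.

```python
import base64
import json

# Role names as listed in the required-claims table.
KNOWN_ROLES = {"ADMIN", "ENGINEER", "ANALYST", "VIEWER"}


def decode_claims(token: str) -> dict:
    """Decode the payload of a compact JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a compact JWT (expected three dot-separated parts)")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))


def missing_claims(claims: dict) -> list:
    """Return the names of required claims that are absent or empty."""
    missing = []
    if not claims.get("sub"):
        missing.append("sub")
    roles = claims.get("realm_access", {}).get("roles", [])
    if not KNOWN_ROLES.intersection(roles):
        missing.append("realm_access.roles")
    if not claims.get("workspace_id"):
        missing.append("workspace_id")
    return missing
```

An empty list from `missing_claims` means the token at least has the right shape; the gateway still decides whether it is valid and authorized.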
Exceptions
- GET /health (gateway) is unauthenticated; its OpenAPI security is [].
- Webhook endpoints (orchestrator trigger, Git webhooks) authenticate with HMAC signature headers (X-Webhook-Signature, X-Hub-Signature-256, X-Gitlab-Token) instead of a JWT.
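For the webhook case, the sketch below shows how an HMAC signature header is typically computed and checked. It assumes X-Hub-Signature-256 follows the GitHub convention of `sha256=` followed by the hex HMAC-SHA256 of the raw request body; this page does not specify the exact scheme the platform uses, so treat the format and the function name `verify_signature` as assumptions.

```python
import hashlib
import hmac


def verify_signature(secret: bytes, body: bytes, header_value: str) -> bool:
    """Check a 'sha256=<hex>' signature header against the raw body.

    Assumes the GitHub-style format; the platform's actual scheme may differ.
    """
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), header_value.encode())
```

The signature must be computed over the exact bytes received; re-serializing parsed JSON usually changes the body and breaks the check.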
Heads up
A missing or invalid token yields 401 UNAUTHORIZED. A valid token whose roles or workspace_id do not permit the operation yields 403 — for example, a cross-workspace pipeline dependency.
Error format
Every service returns a uniform ErrorResponse body on failure.
{
  "error": "NOT_FOUND",
  "message": "Pipeline not found: f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "timestamp": "2025-12-15T14:30:00Z"
}
| Field | Type | Notes |
|---|---|---|
| error | string | Machine-readable code: BAD_REQUEST, UNAUTHORIZED, NOT_FOUND, RATE_LIMITED, GATEWAY_TIMEOUT, SERVICE_UNAVAILABLE |
| message | string | Human-readable description |
| timestamp | string (date-time) | Optional; present on most responses |
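Because the body is uniform, a client can map every failure to a single type. The following is a minimal sketch using the three fields from the table; the class and function names are illustrative, and gateway-specific extras such as retryAfter are deliberately ignored here.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ErrorResponse:
    error: str                       # machine-readable code, e.g. "NOT_FOUND"
    message: str                     # human-readable description
    timestamp: Optional[str] = None  # optional per the field table


def parse_error(body: str) -> ErrorResponse:
    """Parse an ErrorResponse JSON body; raises KeyError if a required field is missing."""
    data = json.loads(body)
    return ErrorResponse(
        error=data["error"],
        message=data["message"],
        timestamp=data.get("timestamp"),
    )
```

Branching on `error` rather than on `message` keeps client logic stable if the wording of messages changes.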
HTTP status codes
| Status | Meaning |
|---|---|
| 200 | OK |
| 201 | Created |
| 202 | Accepted — async (pipeline run / trigger) |
| 204 | No Content — delete |
| 400 | Bad request / invalid payload |
| 401 | Missing or invalid JWT |
| 403 | Forbidden — e.g. cross-workspace dependency |
| 404 | Resource not found |
| 409 | Conflict — duplicate user/cluster, merge conflict |
| 429 | Rate limited (gateway only) |
| 502 / 503 / 504 | Gateway-level upstream errors |
Gateway-specific errors
The gateway extends ErrorResponse for its own failure modes.
| Error | Status | Extra fields |
|---|---|---|
| RateLimited | 429 | Adds retryAfter (integer); also sets X-RateLimit-Limit and X-RateLimit-Reset headers |
| GatewayTimeout | 504 | error: "GATEWAY_TIMEOUT" |
| ServiceUnavailable | 503 | error: "SERVICE_UNAVAILABLE"; body adds service |
// 429 Too Many Requests
{
  "error": "RATE_LIMITED",
  "message": "Rate limit exceeded for /api/v1/pipelines",
  "retryAfter": 42,
  "timestamp": "2025-12-15T14:30:00Z"
}
Rate limiting and correlation headers
The gateway applies these headers to every proxied response.
| Header | Description | Example |
|---|---|---|
| X-RateLimit-Limit | Max requests per minute | 100 |
| X-RateLimit-Remaining | Remaining requests in the window | 87 |
| X-RateLimit-Reset | Unix timestamp of the window reset | 1734272400 |
| X-Request-Id | Request correlation UUID for tracing | d1e2f3a4-5b6c-7d8e-9f0a-1b2c3d4e5f6a |
Rate limits are enforced per route. When the window is exhausted the gateway responds 429 with a retryAfter value (seconds). Always log the X-Request-Id on the client side — it is the single correlation key for support and tracing across all services.
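A client that receives a 429 needs to decide how long to pause before retrying. One possible approach, sketched below, prefers the retryAfter value from the body and falls back to the X-RateLimit-Reset header. The function name, the one-second default, and the assumption that `headers` is a plain mapping with exact-case keys are all illustrative choices, not part of the API.

```python
import time
from typing import Mapping, Optional


def seconds_to_wait(body: Mapping, headers: Mapping, now: Optional[float] = None) -> float:
    """Return how long to pause after a 429 response, in seconds."""
    # Preferred source: retryAfter (seconds) in the RATE_LIMITED body.
    if "retryAfter" in body:
        return max(0.0, float(body["retryAfter"]))
    # Fallback: X-RateLimit-Reset is a Unix timestamp of the window reset.
    reset = headers.get("X-RateLimit-Reset")
    if reset is not None:
        current = time.time() if now is None else now
        return max(0.0, float(reset) - current)
    return 1.0  # arbitrary default when neither hint is present
```

The `now` parameter exists only to make the fallback branch deterministic in tests; production callers can omit it.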
Pagination
There is no platform-wide pagination contract. Three distinct patterns are used across the services; there is no cursor-based pagination anywhere.
1. Page/size offset pagination
Used by Admin (/admin/users, /admin/audit) and the GDPR DSAR list. Query params page (default 0) and size (default 20). The response wraps the items with total, page, and size.
// GET /api/v1/admin/users?page=0&size=20 → 200 OK
{
  "items": [ /* UserResponse[] */ ],
  "total": 134,
  "page": 0,
  "size": 20
}
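Walking every page of such an endpoint comes down to incrementing page until total items have been seen. The sketch below keeps the HTTP call out of scope: `fetch_page` is a hypothetical callable, supplied by the caller, that takes (page, size) and returns the decoded response body shown above.

```python
from typing import Callable, Iterator


def iter_pages(fetch_page: Callable[[int, int], dict], size: int = 20) -> Iterator[dict]:
    """Yield every item from a page/size endpoint, starting at page 0."""
    page = 0
    seen = 0
    while True:
        body = fetch_page(page, size)
        items = body["items"]
        yield from items
        seen += len(items)
        # Stop on an empty page (defensive) or once 'total' items were seen.
        if not items or seen >= body["total"]:
            break
        page += 1
```

The empty-page check guards against looping forever if total changes while the client is iterating.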
2. limit-capped lists
Most "list / history / recent" endpoints accept a limit query param (defaults vary — 10, 50, 100). There is no cursor; the response is a bare array.
GET /api/v1/quality/results?pipelineId=...&limit=50
3. Time-window filters
Alert and event history endpoints filter by time rather than page number, using sinceSeconds, from, and to.
GET /api/v1/monitor/alerts/history?sinceSeconds=86400
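Both the limit-capped and the time-window patterns are plain query strings, so one small helper can build either kind of URL. This is a sketch only: `build_url` is a hypothetical name, and parameters are passed as a dict because `from` is a reserved word in Python and cannot be used as a keyword argument. The value format expected for from and to is not specified on this page, so the usage below sticks to sinceSeconds and limit.

```python
from urllib.parse import urlencode

# Production gateway base URL from the Base URLs table.
BASE_URL = "https://api.dataflow.polkomtel.pl"


def build_url(path: str, params: dict) -> str:
    """Append non-None query parameters to a gateway path."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}{path}?{query}" if query else f"{BASE_URL}{path}"


alerts_url = build_url("/api/v1/monitor/alerts/history", {"sinceSeconds": 86400})
results_url = build_url("/api/v1/quality/results", {"pipelineId": "abc", "limit": 50})
```

Dropping `None` values lets callers pass optional filters unconditionally without emitting empty parameters.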
Endpoint group index
The full API is documented across three reference pages. Use this index to find the right group.
| Group | Service | Reference page |
|---|---|---|
| Pipelines, Connections, Quality | Metadata | API reference: Pipeline Engine (pipelines CRUD) / Metadata & Lineage |
| Catalog, Governance, GDPR, Admin | Metadata | API reference: Metadata & Lineage |
| Execution, Scheduler, Orchestrator | Pipeline Engine | API reference: Pipeline Engine |
| Git, Flink | Pipeline Engine | API reference: Pipeline Engine |
| Lineage graph, traversal, analytics, search, OpenLineage events | Lineage | API reference: Metadata & Lineage |
| Dashboard, Alerts, Definitions, Events, SSE, Channels | Monitor | API reference: Monitor & AI |
| Copilot, Migration, CDC | Proxied services | API reference: Monitor & AI |
Endpoint counts
| Service | Path operations (approx.) |
|---|---|
| Gateway (own) | 6 (plus many $ref re-exports) |
| Metadata | 38 |
| Pipeline Engine | 38 |
| Lineage | 14 |
| Monitor | 35 |
The three proxied-only services (copilot, migration, and the CDC connector-sdk) are surfaced through the gateway but are not formally specified beyond the gateway's own stub operations.