API reference: Monitor & AI

This page documents the Monitor service (openapi-monitor.yaml, port 8085) and the proxied AI Copilot, Migration Engine, and CDC connector services. The Monitor service serves under /api/v1 and requires a bearerAuth JWT on every endpoint.
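
As a minimal sketch of the authentication requirement, the snippet below builds a request that carries the JWT as a bearer token, using only Python's standard library. The host `gateway.example.com` and the token value are placeholders (assumptions, not part of this reference). Only the /api/v1 base path and the bearer-token requirement come from the text above.

```python
import urllib.request

# Hypothetical gateway address; substitute your own deployment's host.
BASE_URL = "https://gateway.example.com/api/v1"


def build_request(path: str, token: str, method: str = "GET") -> urllib.request.Request:
    """Build a request that carries the JWT as a bearer token."""
    return urllib.request.Request(
        BASE_URL + path,
        method=method,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
    )


req = build_request("/monitor/dashboard", token="<your-jwt>")
print(req.get_method(), req.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send the call. The sketch stops short of that so it can be run without a live gateway.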


Dashboard

Dashboard endpoints aggregate platform metrics, connector health, and system health.

| Method | Path | Operation ID | Purpose | Auth |
| --- | --- | --- | --- | --- |
| GET | /api/v1/monitor/dashboard | getDashboard | Aggregated dashboard metrics | JWT |
| GET | /api/v1/monitor/pipelines/{id}/metrics | getPipelineMetrics | Detailed pipeline metrics (percentiles, SLA) | JWT |
| POST | /api/v1/monitor/pipelines/run | recordPipelineRun | Internal: record a pipeline run result | JWT |
| GET | /api/v1/monitor/connectors/health | getConnectorsHealth | Connector health statuses + summary | JWT |
| POST | /api/v1/monitor/connectors/health | recordConnectorHealth | Internal: record a connector health check | JWT |
| GET | /api/v1/monitor/system/health | getSystemHealth | System health (CPU/mem/disk/threads/services) | JWT |

DashboardMetrics includes totalPipelinesRun, successRate, avgDurationMs, rowsProcessed, activeAlerts, pipelinesByStatus, recentRuns[], topErrors[], connectorHealth[], timeSeriesData[]. PipelineMetricsDetail adds p50/p95/p99 durations and slaComplianceRate.

// GET /api/v1/monitor/dashboard  →  200 OK  (DashboardResponse)
{
  "totalPipelinesRun": 1284,
  "successRate": 0.973,
  "avgDurationMs": 184200,
  "rowsProcessed": 9821044,
  "activeAlerts": 3,
  "pipelinesByStatus": { "SUCCESS": 1249, "FAILED": 28, "RUNNING": 7 },
  "recentRuns": [],
  "topErrors": []
}
// GET /api/v1/monitor/pipelines/{id}/metrics  →  200 OK  (PipelineMetricsDetail)
{
  "pipelineId": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "p50DurationMs": 162000,
  "p95DurationMs": 240000,
  "p99DurationMs": 318000,
  "slaComplianceRate": 0.991
}

Note

recordPipelineRun and recordConnectorHealth are internal ingestion endpoints used by the platform itself. They are documented for completeness but are not intended for direct client use.


Alerts

Alert endpoints manage individual alert records, including bulk operations, history, and purge.

| Method | Path | Operation ID | Purpose | Auth |
| --- | --- | --- | --- | --- |
| GET | /api/v1/monitor/alerts | getAlerts | List alerts | JWT |
| POST | /api/v1/monitor/alerts | createAlert | Create an alert | JWT |
| GET | /api/v1/monitor/alerts/{id} | getAlert | Get an alert by ID | JWT |
| DELETE | /api/v1/monitor/alerts/{id} | deleteAlert | Delete an alert | JWT |
| POST | /api/v1/monitor/alerts/{id}/ack | acknowledgeAlert | Acknowledge an alert | JWT |
| POST | /api/v1/monitor/alerts/{id}/resolve | resolveAlert | Resolve an alert | JWT |
| POST | /api/v1/monitor/alerts/bulk/ack | bulkAcknowledge | Bulk acknowledge alerts | JWT |
| POST | /api/v1/monitor/alerts/bulk/resolve | bulkResolve | Bulk resolve alerts | JWT |
| GET | /api/v1/monitor/alerts/history | getAlertHistory | Alert history | JWT |
| GET | /api/v1/monitor/alerts/summary | getAlertSummary | Alert summary counts | JWT |
| POST | /api/v1/monitor/alerts/purge | purgeResolved | Purge old resolved alerts | JWT |

getAlerts accepts status, severity, category, and limit (default 100). getAlertHistory accepts category and sinceSeconds (default 86400, i.e. 24 hours). purgeResolved accepts olderThanSeconds (default 604800, i.e. 7 days). Alert.severity: CRITICAL | WARNING | INFO; Alert.status: ACTIVE | ACKNOWLEDGED | RESOLVED; Alert.category: PIPELINE_FAILURE | PERFORMANCE_DEGRADATION | SECURITY_VIOLATION | DATA_QUALITY | SYSTEM_HEALTH | SLA_BREACH.

// POST /api/v1/monitor/alerts  →  201 Created
// Request: CreateAlertRequest
{
  "title": "Pipeline billing-daily-load failed",
  "severity": "CRITICAL",
  "category": "PIPELINE_FAILURE",
  "message": "Task 'load' failed after 3 retries"
}
// POST /api/v1/monitor/alerts/bulk/ack  →  200 OK
// Request: BulkAlertRequest
{ "alertIds": ["al-0001", "al-0002", "al-0003"] }
// GET /api/v1/monitor/alerts/summary  →  200 OK  (AlertSummaryResponse)
{
  "active": 3,
  "acknowledged": 5,
  "resolved": 142,
  "bySeverity": { "CRITICAL": 1, "WARNING": 2, "INFO": 0 }
}
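
The getAlerts filters described above can be assembled on the client as in the following sketch. It checks each filter against the enum values listed for Alert and omits unset filters. The helper name `alerts_query` is hypothetical, and the sketch assumes each filter takes a single value, which this reference does not confirm.

```python
from typing import Optional
from urllib.parse import urlencode

# Enum values as listed for Alert in this section.
STATUSES = {"ACTIVE", "ACKNOWLEDGED", "RESOLVED"}
SEVERITIES = {"CRITICAL", "WARNING", "INFO"}
CATEGORIES = {
    "PIPELINE_FAILURE", "PERFORMANCE_DEGRADATION", "SECURITY_VIOLATION",
    "DATA_QUALITY", "SYSTEM_HEALTH", "SLA_BREACH",
}


def alerts_query(status: Optional[str] = None, severity: Optional[str] = None,
                 category: Optional[str] = None, limit: int = 100) -> str:
    """Return the getAlerts path with validated query parameters."""
    for name, value, allowed in (
        ("status", status, STATUSES),
        ("severity", severity, SEVERITIES),
        ("category", category, CATEGORIES),
    ):
        if value is not None and value not in allowed:
            raise ValueError(f"unknown {name}: {value!r}")
    params = {"status": status, "severity": severity,
              "category": category, "limit": limit}
    # Drop unset filters; limit is always sent explicitly.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"/api/v1/monitor/alerts?{query}"


print(alerts_query(status="ACTIVE", severity="CRITICAL", limit=25))
```

Validating on the client is a design choice for earlier feedback. The service remains the authority on which values it accepts.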

Alert definitions

Alert definitions are reusable rules that the engine evaluates to generate alert events.

| Method | Path | Operation ID | Purpose | Auth |
| --- | --- | --- | --- | --- |
| GET | /api/v1/monitor/alerts/definitions | getDefinitions | List alert definitions | JWT |
| POST | /api/v1/monitor/alerts/definitions | createDefinition | Create an alert definition | JWT |
| GET | /api/v1/monitor/alerts/definitions/{id} | getDefinition | Get a definition | JWT |
| PUT | /api/v1/monitor/alerts/definitions/{id} | updateDefinition | Update a definition | JWT |
| DELETE | /api/v1/monitor/alerts/definitions/{id} | deleteDefinition | Delete a definition | JWT |
| POST | /api/v1/monitor/alerts/definitions/{id}/toggle | toggleDefinition | Enable/disable a definition | JWT |
| POST | /api/v1/monitor/alerts/definitions/{id}/silence | silenceDefinition | Temporarily silence a definition | JWT |

getDefinitions accepts type and enabled (bool). AlertDefinition.type: PIPELINE_FAILURE | SLA_BREACH | DATA_QUALITY | ROW_COUNT_ANOMALY | CONNECTION_DOWN | RESOURCE_THRESHOLD. Each definition carries an AlertCondition (metric; operator: GREATER_THAN | LESS_THAN | EQUAL | …; threshold; windowMinutes; consecutiveFailures) and recipients[] (NotificationRecipient with channel: EMAIL | SLACK | PAGERDUTY | WEBHOOK).

// POST /api/v1/monitor/alerts/definitions  →  201 Created
// Request: AlertDefinition
{
  "name": "Billing SLA breach",
  "type": "SLA_BREACH",
  "enabled": true,
  "condition": {
    "metric": "duration_ms",
    "operator": "GREATER_THAN",
    "threshold": 300000,
    "windowMinutes": 60,
    "consecutiveFailures": 1
  },
  "recipients": [
    { "channel": "PAGERDUTY", "target": "routing-key-billing" }
  ]
}
// POST /api/v1/monitor/alerts/definitions/{id}/silence  →  200 OK
// Request: { "durationMinutes": 120 }
{
  "id": "def-0001",
  "name": "Billing SLA breach",
  "enabled": true,
  "silencedUntil": "2025-12-15T16:30:00Z"
}
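
One way to assemble the AlertDefinition request body shown above is with small dataclasses, as in this sketch. The field names mirror the JSON example. The class layout, the `to_json` helper, and the default of 1 for consecutiveFailures are assumptions made for illustration rather than part of the published schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List

# Definition types as listed in this section.
DEFINITION_TYPES = {
    "PIPELINE_FAILURE", "SLA_BREACH", "DATA_QUALITY",
    "ROW_COUNT_ANOMALY", "CONNECTION_DOWN", "RESOURCE_THRESHOLD",
}


@dataclass
class AlertCondition:
    metric: str
    operator: str
    threshold: float
    windowMinutes: int
    consecutiveFailures: int = 1  # assumed default, not from the spec


@dataclass
class NotificationRecipient:
    channel: str
    target: str


@dataclass
class AlertDefinition:
    name: str
    type: str
    condition: AlertCondition
    recipients: List[NotificationRecipient] = field(default_factory=list)
    enabled: bool = True

    def to_json(self) -> str:
        """Serialise to the request body, rejecting unknown types."""
        if self.type not in DEFINITION_TYPES:
            raise ValueError(f"unknown definition type: {self.type!r}")
        # asdict() recurses into nested dataclasses and lists of them.
        return json.dumps(asdict(self))


definition = AlertDefinition(
    name="Billing SLA breach",
    type="SLA_BREACH",
    condition=AlertCondition("duration_ms", "GREATER_THAN", 300000, 60),
    recipients=[NotificationRecipient("PAGERDUTY", "routing-key-billing")],
)
print(definition.to_json())
```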

Alert events

Alert events are individual firings generated by the engine when a definition's condition is met.

| Method | Path | Operation ID | Purpose | Auth |
| --- | --- | --- | --- | --- |
| GET | /api/v1/monitor/alerts/events | getEvents | List engine-generated alert events | JWT |
| GET | /api/v1/monitor/alerts/events/{id} | getEvent | Get an alert event | JWT |
| POST | /api/v1/monitor/alerts/events/{id}/ack | acknowledgeEvent | Acknowledge an event | JWT |
| POST | /api/v1/monitor/alerts/events/{id}/resolve | resolveEvent | Resolve an event | JWT |
| POST | /api/v1/monitor/alerts/events/{id}/silence | silenceEvent | Silence an event | JWT |
| GET | /api/v1/monitor/alerts/events/history | getEventHistory | Alert event history | JWT |

getEvents accepts status, severity, and type. getEventHistory accepts definitionId (uuid), from and to (date-time), and limit (default 100). acknowledgeEvent and resolveEvent take a {user} body; silenceEvent takes a {durationMinutes} body. AlertEvent.status: FIRING | ACKNOWLEDGED | RESOLVED. An AlertEvent also carries metricValue, threshold, firedAt, ack/resolve metadata, and silencedUntil.

// GET /api/v1/monitor/alerts/events/{id}  →  200 OK  (AlertEvent)
{
  "id": "ev-0042",
  "definitionId": "def-0001",
  "status": "FIRING",
  "metricValue": 342000,
  "threshold": 300000,
  "firedAt": "2025-12-15T14:30:00Z",
  "silencedUntil": null
}
// POST /api/v1/monitor/alerts/events/{id}/ack  →  200 OK
// Request: { "user": "u-1001" }
{
  "id": "ev-0042",
  "status": "ACKNOWLEDGED",
  "acknowledgedBy": "u-1001"
}
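
The getEventHistory parameters described above can be encoded as in the following sketch, which formats from and to as UTC timestamps in the trailing-Z style used by the examples on this page. The helper name `event_history_path` and the sample UUID are placeholders. Whether the service accepts other date-time formats is not stated here.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode
from uuid import UUID


def event_history_path(definition_id: UUID, start: datetime, end: datetime,
                       limit: int = 100) -> str:
    """Return the getEventHistory path for a definition and time window.

    Pass timezone-aware datetimes; naive values would be read as local time.
    """
    if start >= end:
        raise ValueError("start must be earlier than end")

    def iso(ts: datetime) -> str:
        # Normalise to UTC and emit the trailing-Z form.
        return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

    query = urlencode({
        "definitionId": str(definition_id),
        "from": iso(start),
        "to": iso(end),
        "limit": limit,
    })
    return f"/api/v1/monitor/alerts/events/history?{query}"


end = datetime(2025, 12, 15, 14, 30, tzinfo=timezone.utc)
path = event_history_path(
    UUID("f47ac10b-58cc-4372-a567-0e02b2c3d479"),  # placeholder definition ID
    end - timedelta(hours=1),
    end,
)
print(path)
```

`urlencode` percent-encodes the colons in the timestamps, so the query string is safe to append to the path as-is.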

SSE streams (real-time)

The Monitor service exposes Server-Sent Event streams for real-time alerts and metrics. They route through the gateway via the /api/v1/monitor/sse/** SSE-passthrough prefix.

| Method | Path | Operation ID | Purpose | Auth |
| --- | --- | --- | --- | --- |
| GET | /api/v1/monitor/sse/alerts | streamAlerts | SSE stream of real-time alerts | JWT |
| GET | /api/v1/monitor/sse/metrics | streamMetrics | SSE stream of periodic system metrics | JWT |

streamAlerts accepts a severity query param (comma-separated filter) and returns text/event-stream. Its event types are connected (initial confirmation carrying a subscriberId), alert (an alert JSON payload), and heartbeat (keepalive). The connection times out after 30 minutes.

// GET /api/v1/monitor/sse/alerts?severity=CRITICAL,WARNING
event: connected
data: {"subscriberId":"sub-7c2e","at":"2025-12-15T14:30:00Z"}

event: alert
data: {"id":"al-0099","severity":"CRITICAL","title":"Connector teradata-prod down"}

event: heartbeat
data: {}

Heads up

SSE connections are closed by the server after 30 minutes. Clients must reconnect to keep receiving real-time alerts, and may pass the same severity filter when they do.
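
A client consuming these streams needs to split the text/event-stream body into events. The following is a minimal parser sketch that handles the `event:` and `data:` fields shown above and assumes every data payload is a JSON object, as in the examples. It is not a full implementation of the SSE format (for instance, it ignores `id:` and `retry:` fields), and the function name `parse_sse` is hypothetical.

```python
import json
from typing import Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_type, payload) pairs from text/event-stream lines."""
    event = "message"  # default event type when no "event:" field is sent
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            # A blank line terminates the current event.
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith(":"):
            continue  # SSE comment line
        else:
            name, _, value = line.partition(":")
            # A single leading space after the colon is not part of the value.
            value = value[1:] if value.startswith(" ") else value
            if name == "event":
                event = value
            elif name == "data":
                data.append(value)


sample = [
    "event: connected\n",
    'data: {"subscriberId":"sub-7c2e","at":"2025-12-15T14:30:00Z"}\n',
    "\n",
    "event: heartbeat\n",
    "data: {}\n",
    "\n",
]
for event_type, payload in parse_sse(sample):
    if event_type == "heartbeat":
        continue  # keepalive only, nothing to process
    print(event_type, payload)
```

In a real client the `lines` argument would be the decoded lines of the HTTP response, and the loop would sit inside an outer reconnect loop to cope with the 30-minute server-side close.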


Notification channels

Notification channels deliver alerts to external destinations.

| Method | Path | Operation ID | Purpose | Auth |
| --- | --- | --- | --- | --- |
| GET | /api/v1/monitor/channels | listNotificationChannels | List notification channels | JWT |
| POST | /api/v1/monitor/channels | createNotificationChannel | Create a notification channel | JWT |
| GET | /api/v1/monitor/channels/{id} | getNotificationChannel | Get a channel by ID | JWT |
| PUT | /api/v1/monitor/channels/{id} | updateNotificationChannel | Update a channel | JWT |
| DELETE | /api/v1/monitor/channels/{id} | deleteNotificationChannel | Delete a channel | JWT |
| POST | /api/v1/monitor/channels/{id}/test | testNotificationChannel | Send a test notification | JWT |

listNotificationChannels accepts workspaceId (uuid), channelType, and enabled (bool). NotificationChannelDto.channelType: EMAIL | SLACK | PAGERDUTY | WEBHOOK | TEAMS. The config object is channel-specific: EMAIL uses recipients, SLACK and TEAMS use webhookUrl, PAGERDUTY uses routingKey, and WEBHOOK uses url.

// POST /api/v1/monitor/channels  →  201 Created
// Request: NotificationChannelDto
{
  "name": "Billing team Slack",
  "channelType": "SLACK",
  "enabled": true,
  "config": { "webhookUrl": "https://hooks.slack.com/services/T000/B000/XXXX" }
}
// POST /api/v1/monitor/channels/{id}/test  →  200 OK  (ChannelTestResult)
{
  "success": true,
  "message": "Test notification delivered",
  "testedAt": "2025-12-15T14:30:00Z"
}
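
The channel-specific config keys lend themselves to a simple lookup, sketched below as a client-side check before calling createNotificationChannel. The mapping comes from the description above. Treating each key as required and non-empty is an assumption, and `validate_channel` is a hypothetical helper name.

```python
# Config key each channelType uses, per the description in this section.
CONFIG_KEY_BY_TYPE = {
    "EMAIL": "recipients",
    "SLACK": "webhookUrl",
    "TEAMS": "webhookUrl",
    "PAGERDUTY": "routingKey",
    "WEBHOOK": "url",
}


def validate_channel(channel: dict) -> None:
    """Raise ValueError if config lacks the key its channelType uses."""
    channel_type = channel.get("channelType")
    if channel_type not in CONFIG_KEY_BY_TYPE:
        raise ValueError(f"unknown channelType: {channel_type!r}")
    key = CONFIG_KEY_BY_TYPE[channel_type]
    # Assumption: the key must be present and non-empty.
    if not channel.get("config", {}).get(key):
        raise ValueError(f"{channel_type} channel requires config.{key}")


slack_channel = {
    "name": "Billing team Slack",
    "channelType": "SLACK",
    "enabled": True,
    "config": {"webhookUrl": "https://hooks.slack.com/services/T000/B000/XXXX"},
}
validate_channel(slack_channel)  # passes silently
```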

AI Copilot

The Copilot service (port 8086) is proxied through the gateway. It is surfaced by a single gateway stub operation rather than a formal OpenAPI spec.

| Method | Path | Operation ID | Purpose | Auth |
| --- | --- | --- | --- | --- |
| GET | /api/v1/ai/copilot | copilotEndpoint | AI copilot (proxied to copilot-service) | JWT |

// GET /api/v1/ai/copilot  →  200 OK
{
  "service": "copilot",
  "status": "available"
}

Note

The Copilot, Migration, and CDC services are proxied-only: they are reachable through the gateway but are not formally specified beyond these gateway stub operations. The shapes shown here are illustrative; consult the service teams for the full request and response contracts.


Migration Engine

The Migration service (port 8087) is proxied through the gateway and exposes migration plans.

| Method | Path | Operation ID | Purpose | Auth |
| --- | --- | --- | --- | --- |
| GET | /api/v1/migration/plans | migrationPlans | Migration plans (proxied to migration-service) | JWT |

// GET /api/v1/migration/plans  →  200 OK
{
  "plans": [
    { "id": "mig-0001", "name": "Teradata → Snowflake DWH migration", "status": "IN_PROGRESS" }
  ]
}

CDC connectors

CDC connector operations are proxied to the Connector SDK (port 8088).

| Method | Path | Operation ID | Purpose | Auth |
| --- | --- | --- | --- | --- |
| GET | /api/v1/cdc/connectors | listCdcConnectors | List CDC connectors | JWT |
| POST | /api/v1/cdc/connectors | deployCdcConnector | Deploy a CDC connector | JWT |
| GET | /api/v1/cdc/health | cdcHealth | CDC health summary | JWT |

deployCdcConnector accepts a generic object body and returns 201 Created.

// POST /api/v1/cdc/connectors  →  201 Created
// Request: generic connector configuration object
{
  "name": "oracle-orders-cdc",
  "sourceType": "ORACLE",
  "tables": ["SALES.ORDERS", "SALES.ORDER_ITEMS"]
}
// GET /api/v1/cdc/health  →  200 OK
{
  "status": "UP",
  "connectors": 6,
  "running": 6,
  "failed": 0
}