Deployment scenarios & sizing

DataFlow AI can be deployed in three topologies, each with a different balance of cost, control, and time-to-launch. This page describes all three (On-Premises, GCP Full Cloud, and the recommended Hybrid) with their infrastructure requirements, capacity sizing, cost tables, security posture, and a decision guide for choosing between them.


The three topologies at a glance

A "topology" here means where the platform runs and where the data lives. The deployment-scenarios source document defines three, and recommends Hybrid for Polkomtel Plus.

Topology | Where it runs | 3-Year TCO (USD) | Annual cost | Time to first production pipeline | Full production
1. On-Premises | Polkomtel's own Warsaw data center | $1,836,000 (≈$2,286,000 incl. CapEx) | $582K OpEx + $540K CapEx in Year 0 | 14–24 weeks | 6–9 months
2. GCP Full Cloud | GCP europe-central2 (Warsaw), DR in europe-west3 (Frankfurt) | $495,000 (realistic) | $165,000 (realistic) | 3–4 weeks | 2–3 months
3. Hybrid ⭐ Recommended | Source databases stay on-prem; the platform runs on GCP | $606,000–$741,000 | $197,000–$242,000 | 10–14 weeks | 4–6 months

A few terms used throughout this page:

  • CapEx (capital expenditure) — a large upfront purchase, e.g. buying servers.
  • OpEx (operating expenditure) — a recurring cost, e.g. a monthly cloud bill.
  • TCO (total cost of ownership) — the all-in cost over a defined period, here three years.
  • CDC (change data capture) — streaming each database change as it happens.
  • DR (disaster recovery) — a standby copy of the system in a second location.

Pricing basis

All cost figures use GCP list prices for Q1 2026, an exchange rate of 1 USD = 4.00 PLN, and Dell Q1 2026 Polish enterprise channel pricing. Polish VAT (23%) is not included in any cost table. GCP always bills in USD, which introduces foreign-exchange risk for a PLN-denominated budget.
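
The foreign-exchange exposure can be illustrated with a short sketch. It converts the Realistic GCP annual figure ($165,000) to PLN at the page's base rate and at two alternative rates. The alternative rates (3.80 and 4.40) and the helper names are hypothetical and chosen only for illustration.

```python
BASE_RATE_PLN_PER_USD = 4.00  # exchange rate used throughout this page


def usd_to_pln(amount_usd: float, rate: float = BASE_RATE_PLN_PER_USD) -> float:
    """Convert a USD amount to PLN at the given PLN-per-USD rate."""
    return amount_usd * rate


def pln_to_usd(amount_pln: float, rate: float = BASE_RATE_PLN_PER_USD) -> float:
    """Convert a PLN amount to USD at the given PLN-per-USD rate."""
    return amount_pln / rate


realistic_annual_usd = 165_000  # GCP Full Cloud, Realistic profile

# 3.80 and 4.40 are hypothetical rates, not forecasts.
for rate in (3.80, 4.00, 4.40):
    pln = usd_to_pln(realistic_annual_usd, rate)
    print(f"{rate:.2f} PLN/USD -> {pln:,.0f} PLN/yr")
```

The same USD bill costs more in PLN whenever the złoty weakens, which is why a PLN-denominated budget needs some headroom.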


Scenario 1 — On-Premises

Every DataFlow AI component runs on hardware Polkomtel owns and operates in its own Warsaw data center, on bare-metal Dell PowerEdge or VMware-virtualized servers, with Kubernetes provided by Red Hat OpenShift. No data ever leaves the corporate network — this is the maximum-data-sovereignty option.

Headline numbers: 2,160,000 PLN CapEx · $582K annual OpEx · $1.836M 3-Year TCO · up to 24 weeks to launch · 13 Kubernetes nodes (10 workers + 3 control plane) · 100% data sovereignty.

Infrastructure — hardware to buy (CapEx)

The On-Premises topology requires a one-time hardware purchase totalling 2,160,000 PLN (≈$540,000).

Component | Specification | Count | Unit (PLN) | Total (PLN)
Kubernetes worker nodes | Dell PowerEdge R750 · 2× Xeon Silver 4316 · 256 GB ECC · 2× 1.92 TB NVMe · 25 GbE | 10 | 65,000 | 650,000
Kubernetes control plane | Dell PowerEdge R650 · 2× Xeon Silver 4310 · 128 GB · 2× 960 GB NVMe | 3 | 35,000 | 105,000
PostgreSQL HA servers | Dell PowerEdge R750xs · 2× Xeon Gold 5318Y · 512 GB · 8× 3.84 TB NVMe RAID10 | 2 | 95,000 | 190,000
Redis cluster nodes | Dell PowerEdge R650 · Xeon Silver 4310 · 128 GB · 4× 960 GB NVMe | 3 | 30,000 | 90,000
Kafka broker nodes | Dell PowerEdge R750 · Xeon Silver 4316 · 128 GB · 6× 3.84 TB NVMe · KRaft | 5 | 55,000 | 275,000
Storage array (NAS) | NetApp AFF A400 · 200 TB raw (50 TB usable after RAID + replication) · 100 GbE | 1 | 280,000 | 280,000
Load balancers | F5 BIG-IP i2800 · HA pair · WAF module · SSL offload | 2 | 60,000 | 120,000
Top-of-rack switches | Arista 7050X3 · 32× 100 GbE | 4 | 45,000 | 180,000
Spine switches | Arista 7280R3 · 36× 400 GbE | 2 | 85,000 | 170,000
UPS (N+1) | APC Symmetra LX 40 kVA · 15-min battery at full load | 2 | 50,000 | 100,000
Total CapEx | | | | 2,160,000 PLN (≈$540,000)

Infrastructure — annual running cost (OpEx)

Running the on-prem estate costs 2,328,000 PLN/yr (≈$582,000/yr).

Category | Annual (PLN) | Notes
Colocation (Warsaw DC) | 336,000 | 6 racks, 30 kW, precision cooling, dual 10 Gbps uplinks
Power & cooling | 102,000 | 30 kW average; cooling at PUE 1.4
Internet bandwidth | 108,000 | 10 Gbps redundant fiber, 2 ISPs, BGP failover
Red Hat OpenShift Enterprise | 216,000 | 13 nodes, Red Hat support, ACM + ACS security
Confluent Platform (Kafka) | 336,000 | 5 broker enterprise license, Schema Registry, ksqlDB
HashiCorp Vault Enterprise | 72,000 | 3-node HA, HSM seal, audit logging
Hardware maintenance (15% of CapEx/yr) | 324,000 | Dell ProSupport+
Backup & DR software | 60,000 | Veeam, immutable backups
Security (EDR/IDS/IPS) | 42,000 | CrowdStrike, Fortinet, Nessus
Infrastructure engineers (2 FTE) | 300,000 | 2 dedicated senior engineers
Hardware amortization (5-year) | 432,000 | 2.16M PLN ÷ 5 years
Total annual OpEx | 2,328,000 PLN/yr | ≈$582,000/yr

3-Year TCO and scaling

Year 0 CapEx is 2,160,000 PLN; Years 1–3 OpEx is 2,328,000 PLN each — a 3-Year total of 9,144,000 PLN (≈$2,286,000). There is no elastic scaling: the hardware is provisioned for peak load from day one. A hardware refresh is required at Year 5 (a further +2,160,000 PLN).
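
The arithmetic behind this total can be reproduced in a few lines. This is a minimal sketch using only the figures quoted above and the page's exchange rate of 1 USD = 4.00 PLN; the function name is illustrative.

```python
PLN_PER_USD = 4.00  # exchange rate from the pricing basis


def three_year_tco_pln(capex_pln: int, annual_opex_pln: int, years: int = 3) -> int:
    """Year 0 CapEx plus a flat annual OpEx for the given number of years."""
    return capex_pln + years * annual_opex_pln


capex_pln = 2_160_000        # one-time hardware purchase
annual_opex_pln = 2_328_000  # running cost per year, including amortization

tco_pln = three_year_tco_pln(capex_pln, annual_opex_pln)
tco_usd = tco_pln / PLN_PER_USD

print(f"3-Year TCO: {tco_pln:,} PLN (about ${tco_usd:,.0f})")
# 3-Year TCO: 9,144,000 PLN (about $2,286,000)
```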

On-prem scaling is manual and slow. Worker nodes are added when average CPU exceeds 70% sustained for two weeks (order three R750 nodes — 8-week lead time). A sixth Kafka broker is added when a broker's partition-leader count exceeds 200 (55,000 PLN + 1 week to install). Power has headroom to 45 kW before the colocation footprint must expand.
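
These triggers lend themselves to a simple automated check. The sketch below is one possible way to encode them, not the platform's actual monitoring logic. It assumes that "sustained for two weeks" means 14 consecutive daily CPU averages above 70%, and all function names and input shapes are hypothetical.

```python
CPU_THRESHOLD = 0.70         # average CPU utilisation that triggers scale-out
SUSTAINED_DAYS = 14          # assumption: two weeks = 14 daily averages
MAX_PARTITION_LEADERS = 200  # per-broker partition-leader limit
POWER_HEADROOM_KW = 45       # colocation power limit before expansion


def needs_worker_nodes(daily_avg_cpu: list[float]) -> bool:
    """True if each of the last SUSTAINED_DAYS daily averages exceeds the threshold."""
    recent = daily_avg_cpu[-SUSTAINED_DAYS:]
    return len(recent) == SUSTAINED_DAYS and all(v > CPU_THRESHOLD for v in recent)


def needs_kafka_broker(leaders_per_broker: dict[str, int]) -> bool:
    """True if any broker leads more partitions than the limit."""
    return any(n > MAX_PARTITION_LEADERS for n in leaders_per_broker.values())


def needs_colo_expansion(power_draw_kw: float) -> bool:
    """True if the power draw exceeds the current colocation headroom."""
    return power_draw_kw > POWER_HEADROOM_KW


if needs_worker_nodes([0.74] * 14):
    print("Order three R750 worker nodes (8-week lead time)")
if needs_kafka_broker({"broker-1": 180, "broker-2": 215}):
    print("Add a sixth Kafka broker")
```

Because the worker-node lead time is eight weeks, a check like this is most useful when it runs on a forecast of CPU load rather than only on past readings.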

When to choose On-Premises

  • Best for: maximum data sovereignty; air-gapped or high-security environments; organizations that already have a data center and a large infrastructure staff; regulatory environments that prohibit cloud entirely (military, government).
  • Not recommended for: fast time-to-launch (under 3 months); elastic or unpredictable workloads; teams that want managed services; heavy AI/ML work that needs GCP's Vertex AI or BigQuery ML.

Scenario 2 — GCP Full Cloud

The entire platform runs on Google Cloud Platform in the europe-central2 (Warsaw, Poland) region, with disaster recovery in europe-west3 (Frankfurt, Germany). GKE Autopilot manages all containerized workloads, and the data services — Cloud SQL, Memorystore, Dataproc Serverless, Cloud Composer — are fully managed by Google.

Headline numbers: $0 CapEx · $13,750/month (realistic) · $495K 3-Year TCO (realistic) · 3–4 weeks to launch · 100% managed services.

Three sub-scenarios

GCP Full Cloud has three cost profiles depending on load. The middle one — Realistic — is the recommended baseline.

Sub-scenario | Monthly | Annual | Profile
Minimum | $8,150 | $97,800 | Dev/test or lean production, ~50 active pipelines, no streaming CDC, no DR, single region
Realistic | $13,750 | $165,000 | Full production, 200 active pipelines, moderate CDC streaming, full HA, warm DR standby
Pessimistic | $23,200 | $278,400 | Peak load, heavy CDC, active-active DR in both regions, 40+ TB/day

The Minimum profile is explicitly not recommended for production Polkomtel workloads — it has no high availability and no disaster recovery.

GCP service cost breakdown (Realistic profile)

The Realistic profile's $13,750/month breaks down across twelve service categories. The figures below are for a deployment of 500+ pipelines, 15–25 TB/day, and 30–50 concurrent executions.

Category | Min | Realistic | Pessimistic
1. Compute — GKE Autopilot (platform services, pipeline engine, Flink, connectors, cluster fee) | $1,873 | $3,406 | $6,639
2. Dataproc — Spark batch jobs + history server | $729 | $2,652 | $6,924
3. Database — Cloud SQL PG15 HA (instance, SSD, backups, read replica) | $320 | $754 | $1,741
4. Storage — Cloud Storage (Standard + Nearline + operations) | $65 | $193 | $639
5. Messaging — Confluent Kafka + Pub/Sub | $2,396 | $4,836 | $9,822
6. Caching — Memorystore Redis (primary HA + read replica) | $143 | $716 | $1,432
7. Orchestration — Cloud Composer (Airflow) | $282 | $565 | $1,130
8. Networking — VPN, egress, NAT, load balancer, DNS | $302 | $590 | $1,959
9. Security — Cloud Armor, Cloud KMS, Secret Manager | $32 | $121 | $445
10. Operations — Logging, Monitoring, Artifact Registry, Cloud Build | $29 | $89 | $300
11. Disaster recovery (europe-west3) | $20 | $872 | $4,270
12. Miscellaneous buffer (5–7%) | $79 | $56 | $464
Total monthly | $8,150 | $13,750 | $23,200
Total annual | $97,800 | $165,000 | $278,400

Two cost levers dominate:

  • Confluent Cloud Kafka is the single largest line item ($4,752/month in the Realistic profile) and the most negotiable — an annual commitment typically yields a 30–50% discount.
  • Dataproc Serverless cost is the most variable. "Pushdown SQL" — running the transformation inside the source database (Teradata, Snowflake) instead of moving data to Spark — is the primary cost lever.

3-Year TCO (GCP Full Cloud)

Profile | Year 1 | Year 2 | Year 3 | 3-Year total
Minimum | $97,800 | $97,800 | $97,800 | $293,400
Realistic | $165,000 | $165,000 | $165,000 | $495,000
Realistic + 3-year CUD | $155,000 | $134,400 | $134,400 | $423,800
Pessimistic | $278,400 | $278,400 | $278,400 | $835,200

A CUD (Committed Use Discount) is a price reduction Google gives in exchange for committing to a 1- or 3-year usage level. Applying a full 3-year CUD plus Dataproc Spot instances can cut the realistic annual GCP cost from $165,000 down to roughly $104,076/yr — a 3-year saving of about $182,772. The catch: CUDs require an upfront commitment, and Google bills the committed amount monthly regardless of actual usage.
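
The saving quoted above follows directly from the two annual figures. A minimal sketch of the calculation, using only numbers from this section (the variable names are illustrative):

```python
realistic_annual_usd = 165_000  # Realistic profile at list price
optimized_annual_usd = 104_076  # full 3-year CUD plus Dataproc Spot instances

annual_saving = realistic_annual_usd - optimized_annual_usd
three_year_saving = 3 * annual_saving
saving_pct = annual_saving / realistic_annual_usd

print(f"Annual saving:  ${annual_saving:,}")      # $60,924
print(f"3-year saving:  ${three_year_saving:,}")  # $182,772
print(f"Reduction:      {saving_pct:.1%}")        # 36.9%
```

The saving only materialises if actual usage stays at or above the committed level, since the commitment is billed regardless.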

When to choose GCP Full Cloud

  • Pros: zero CapEx; elastic auto-scaling; fully managed data services (no database administration); Warsaw region keeps data GDPR-compliant; built-in DR in Frankfurt with under-60-second failover; access to Claude API, Vertex AI, and BigQuery ML; provisioning in 24–48 hours; automatic patching; fastest time-to-launch at 3–4 weeks.
  • Cons: ongoing spend with no "paid off" point; data-egress costs when results are sent back on-prem; internet dependency; billing spikes if workloads exceed estimates; moving 100 TB+ of Teradata data into GCP is expensive and risky; GCP vendor lock-in; foreign-exchange risk because Google bills in USD.

Scenario 3 — Hybrid

In the Hybrid topology, the source databases (Teradata, Oracle, SAP HANA, MSSQL) stay on-premises because they are already there and moving them is expensive and risky, while the DataFlow AI platform itself runs on GCP europe-central2. The two halves are joined by a dedicated, private 10 Gbps Google Cloud Interconnect link with under-5-millisecond latency.

Headline numbers: $0 new CapEx · $12,740/month GCP spend · $242K total annual · $741K 3-Year TCO · 10–14 weeks to launch · under-5 ms Interconnect latency.

What stays on-prem and why

Component | Why it stays on-prem | New cost
Teradata Data Warehouse | Existing investment, 100 TB+, migration risk; pushdown SQL runs near it | $0 (existing)
Oracle ERP database | Business-critical; on-prem regulatory policy; license tied to hardware | $0 (existing)
SAP HANA | SAP licensing tied to on-prem servers | $0 (existing)
Active Directory | Corporate identity; federated to Keycloak on GCP via LDAP | $0 (existing)
Debezium CDC agents (4 VMs) | Co-located with the source databases for low-latency change capture | $0 (existing VMware capacity)
PII masking agents (2 VMs) | Mask PESEL and other personal data before it crosses to GCP | $0 (existing VMware capacity)
On-prem Kafka buffer (3 brokers) | Absorbs CDC bursts; retains events if the Interconnect link drops | 180,000 PLN/yr (or $0 if existing servers are reused)
Cloud Interconnect on-prem termination | Cisco ASR edge router, Warsaw cross-connect | 7,200 PLN/month

A key compliance feature: personal data is masked on-prem at the Debezium layer before it ever crosses to GCP. PESEL, NIP, REGON, phone numbers, and email addresses are pseudonymized with a deterministic SHA-256 hash. The AI Copilot only ever receives schema context — never raw billing or customer data.
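
The masking step can be sketched as follows. This is an illustrative example rather than the platform's masking agent. The source only states that a deterministic SHA-256 hash is used, so the secret salt, the field names, and both function names are assumptions made for this example.

```python
import hashlib

# Assumption: a secret value kept on-prem is mixed into the hash so that
# short identifiers such as PESEL cannot be recovered by brute force.
SECRET_SALT = b"replace-with-on-prem-secret"

# Hypothetical field names for the identifiers listed above.
PII_FIELDS = {"pesel", "nip", "regon", "phone", "email"}


def pseudonymize(value: str, salt: bytes = SECRET_SALT) -> str:
    """Return the SHA-256 hex digest of salt + value (same input, same output)."""
    return hashlib.sha256(salt + value.encode("utf-8")).hexdigest()


def mask_record(record: dict) -> dict:
    """Replace PII fields with pseudonyms and leave all other fields untouched."""
    return {
        key: pseudonymize(str(val)) if key in PII_FIELDS and val is not None else val
        for key, val in record.items()
    }


event = {"customer_id": 42, "pesel": "44051401359", "email": "jan@example.com", "plan": "5G"}
masked = mask_record(event)
print(masked["customer_id"], masked["plan"], masked["pesel"][:12] + "...")
```

Because the hash is deterministic, the same PESEL always maps to the same token, so joins and deduplication on the masked column still work on the GCP side.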

Hybrid cost breakdown

The GCP-side platform cost in the Hybrid topology is about 7% lower than full-cloud ($12,740 versus $13,750 per month) because the CDC agents and their Kafka buffering run on-prem, reducing GKE, Kafka, and storage spend on GCP.

Cost component | Annual
On-prem incremental (Interconnect, Kafka buffer, 0.5 FTE network engineer) | ~$89,100/yr (or ~$44K with existing Kafka)
GCP platform ($12,740/month) | $152,880/yr
Combined Hybrid annual | $241,980/yr (≈$242,000)
Combined, reusing existing on-prem Kafka | ≈$197,000/yr

3-Year TCO (Hybrid)

Variant | Year 1 | Year 2 | Year 3 | 3-Year total
Hybrid (new Kafka hardware) | $257,000 (incl. +$15K Interconnect setup) | $242,000 | $242,000 | $741,000
Hybrid (existing on-prem Kafka) | $212,000 | $197,000 | $197,000 | $606,000

There is a one-time ~$15,000 physical cross-connect installation at the Warsaw Interconnect point of presence, with a 6–8 week lead time for the physical circuit — this is the critical path for a Hybrid rollout.
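
The Hybrid totals can be reproduced from the annual figures and the one-time setup fee. The sketch below uses the rounded annual values from the cost tables in this section; the function name is illustrative.

```python
INTERCONNECT_SETUP_USD = 15_000  # one-time cross-connect installation


def hybrid_three_year_tco(annual_usd: int, setup_usd: int = INTERCONNECT_SETUP_USD,
                          years: int = 3) -> tuple[list[int], int]:
    """Return per-year costs (setup fee added to Year 1) and their total."""
    yearly = [annual_usd + setup_usd] + [annual_usd] * (years - 1)
    return yearly, sum(yearly)


gcp_platform_annual = 12_740 * 12  # 152,880
on_prem_incremental = 89_100       # Interconnect, Kafka buffer, 0.5 FTE
combined_annual = gcp_platform_annual + on_prem_incremental  # 241,980, rounded to 242,000

new_kafka = hybrid_three_year_tco(242_000)
existing_kafka = hybrid_three_year_tco(197_000)

print(new_kafka)       # ([257000, 242000, 242000], 741000)
print(existing_kafka)  # ([212000, 197000, 197000], 606000)
```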

When to choose Hybrid

  • Pros: keeps sensitive source databases on-prem; reuses existing infrastructure; GCP handles elastic compute, AI, and analytics; the 10 Gbps Interconnect is private, not over the internet; PII is masked on-prem before reaching GCP (RODO-compliant); a progressive migration path; lower GCP costs than full-cloud; a 3-year TCO of ~$606K versus $1.836M for On-Premises — a saving of $1.23M.
  • Cons: the most complex to set up (two environments); a 6–8 week Interconnect lead time; needs network engineers for BGP/VPN/VLAN configuration; data residency is split (metadata in GCP, raw data on-prem); partial dependency on Interconnect uptime — though the on-prem Kafka buffer holds events locally for 48 hours.

Why Hybrid is recommended for Polkomtel

Polkomtel already owns Teradata, Oracle, and SAP HANA on-prem, a sunk cost best kept in place to avoid migration risk. Hybrid costs $197K–$242K/yr versus $165K/yr for GCP-only, a $32K–$77K/yr premium that buys data sovereignty. Over three years, Hybrid ($741K) versus On-Premises ($2,286K) saves about $1.55M, a return on investment above 200%. That is a cost saving of roughly 68% with new Kafka hardware, rising to 73% when existing on-prem Kafka is reused ($606K). Hybrid also reaches a first production pipeline in 10–14 weeks, versus 14–24 weeks for On-Premises, which matters for hitting the Informatica decommission deadline.
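
The comparison figures in this paragraph can be checked with a short calculation. This sketch uses the 3-year totals and annual costs quoted on this page; the variable names are illustrative.

```python
on_prem_tco = 2_286_000      # On-Premises, 3-year total incl. CapEx
hybrid_new_kafka = 741_000   # Hybrid with new Kafka hardware
hybrid_existing = 606_000    # Hybrid reusing existing on-prem Kafka
gcp_only_annual = 165_000    # GCP Full Cloud, Realistic profile

saving_usd = on_prem_tco - hybrid_new_kafka
saving_pct_new = 1 - hybrid_new_kafka / on_prem_tco
saving_pct_existing = 1 - hybrid_existing / on_prem_tco

premium_low = 197_000 - gcp_only_annual
premium_high = 242_000 - gcp_only_annual

print(f"3-year saving vs On-Premises: ${saving_usd:,}")       # $1,545,000
print(f"Saving, new Kafka hardware:   {saving_pct_new:.0%}")
print(f"Saving, existing Kafka:       {saving_pct_existing:.0%}")
print(f"Annual premium vs GCP-only:   ${premium_low:,} to ${premium_high:,}")
```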


Capacity planning and growth

The three-year capacity plan assumes a Year 1 launch, with active pipelines growing 30% by Year 2 and 50% by Year 3, both relative to Year 1.

Metric | Year 1 | Year 2 | Year 3 | On-Prem impact | GCP / Hybrid impact
Active pipelines | 200 | 260 | 300 | Possible K8s node scale-out in Year 3 | GKE autoscales; +$600/mo in Year 3
Daily data volume processed | 500 GB | 800 GB | 1,200 GB | NetApp headroom; monitor Kafka storage | Cloud Storage is unbounded; +$150/mo Dataproc in Year 3
CDC event throughput (peak) | 5,000 eps | 8,000 eps | 12,000 eps | May need a 6th Kafka broker in Year 3 | Confluent autoscales partitions
AI Copilot queries/day | 500 | 1,500 | 3,000 | Needs Anthropic API regardless | Claude API +$100/mo Year 2, +$250/mo Year 3
Concurrent users | 50 | 80 | 120 | K8s horizontal pod autoscaler handles it | GKE autoscales
Lineage graph nodes | 50,000 | 150,000 | 500,000 | pgvector tuning needed at 500K | Cloud SQL auto-grows storage

The Year 3 cost increase is roughly +$130K CapEx and +$50K/yr OpEx for On-Premises, versus +$7,000/month (~$84K/yr) for GCP or Hybrid. On GCP and Hybrid, scaling is automatic — no node provisioning, storage auto-grows, and budget guardrails alert at 80%, 100%, and 130% of the configured budget.
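
The guardrail behaviour can be sketched as a small function that reports which alert thresholds a given spend has crossed. This is an illustration of the 80% / 100% / 130% rule only; it does not call any GCP billing API, and the function name and inputs are hypothetical.

```python
ALERT_THRESHOLDS = (0.80, 1.00, 1.30)  # fractions of the configured budget


def crossed_thresholds(spend_usd: float, budget_usd: float) -> list[float]:
    """Return every alert threshold that the current spend has reached."""
    if budget_usd <= 0:
        raise ValueError("budget_usd must be positive")
    ratio = spend_usd / budget_usd
    return [t for t in ALERT_THRESHOLDS if ratio >= t]


monthly_budget = 13_750  # Realistic profile, monthly

for spend in (10_000, 12_000, 14_000, 18_000):
    alerts = crossed_thresholds(spend, monthly_budget)
    labels = ", ".join(f"{t:.0%}" for t in alerts) or "none"
    print(f"${spend:,} spent -> alerts: {labels}")
```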


Time-to-launch comparison

Phase | On-Premises | GCP Full Cloud | Hybrid
Infrastructure provisioning | 6–12 wks (hardware procurement) | 2–5 days (Terraform apply) | 6–8 wks (Cloud Interconnect physical circuit)
Platform deployment | 4–8 wks | 1–2 wks | 2–3 wks
Connectivity & security | 2–4 wks | 1 wk | 3–4 wks
First pipeline in production | 14–24 wks | 3–4 wks | 10–14 wks
Full production (all pipelines) | 6–9 months | 2–3 months | 4–6 months
Risk level | High (procurement, hardware failure) | Low (managed, auto-recovery) | Medium (network complexity)
Team requirement | 2+ FTE infra engineers dedicated | 0.5 FTE GCP admin | 0.5 FTE network + 0.25 FTE GCP admin

Security and RODO compliance per scenario

RODO is the Polish name for the GDPR, the EU regulation that applies directly in Poland. The table below shows how each topology meets the key compliance requirements (✓ Full / ⚠ Partial).

Requirement | On-Prem | GCP Full | Hybrid
RODO (Polish GDPR) | ✓ all data on-prem | ✓ GCP europe-central2 Warsaw | ✓ PII masked before GCP
UKE telecom regulations | | ⚠ partial — verify with legal | ✓ CDR / raw telco data stays on-prem
SOC 2 Type II | ⚠ manual implementation + audit | ✓ inherits GCP certification | ✓ GCP covered, on-prem manual
ISO 27001 | ⚠ manual audit | ✓ GCP certified | ⚠ partial
Encryption at rest | ✓ Vault HSM seal | ✓ Cloud KMS CMEK | ✓ both
Encryption in transit | ✓ internal mTLS | ✓ Google TLS 1.3 | ✓ MACsec Interconnect + mTLS
Network segmentation (zero-trust) | ⚠ manual VLAN + firewall | ✓ VPC Service Controls | ✓ GCP VPC + on-prem VLAN
Audit logging (immutable) | ⚠ ELK + Wazuh manual SIEM | ✓ Cloud Audit Logs (400-day retention) | ✓ both
DDoS protection | ⚠ FortiGate IPS | ✓ Cloud Armor Enterprise | ✓ Cloud Armor
Data Loss Prevention | ⚠ manual policies | ✓ Cloud DLP + DataFlow governance | ✓ Debezium masks + Cloud DLP

All three topologies share the same DataFlow AI security features: Keycloak 24 for OIDC/SAML with Active Directory federation and MFA; HashiCorp Vault for dynamic 30-minute database credentials; API Gateway RBAC with 5 roles and 26 permissions; default-deny Kubernetes network policies; and 30-day rotation of all database passwords, API keys, and Kafka credentials.

For disaster recovery, the two source documents quote different targets — note the discrepancy:

  • The deployment-scenarios document quotes a warm GCP DR target of RTO < 60 seconds, RPO < 30 seconds.
  • The GCP cost-analysis document quotes RTO < 4 hours, RPO < 15 minutes for the same DR setup.

Decision guide — choosing a topology

The source document provides a decision flowchart. Walk through these questions in order:

Q1. Do regulations require ALL data to stay on-premises?
      YES  ->  On-Premises  (military / government air-gap)
      NO   ->  go to Q2

Q2. Are there large existing on-prem databases (Teradata / Oracle / SAP HANA)?
      YES  ->  go to Q3
      NO   ->  go to Q4

Q3. Is data transfer to the cloud acceptable, given PII masking on-prem first?
      YES  ->  HYBRID  (recommended)
      NO   ->  On-Premises

Q4. Is the budget under $200K/year?
      YES  ->  GCP Full Cloud (Minimum or Realistic)
      NO   ->  GCP Full Cloud (Realistic or Pessimistic)

The Polkomtel Plus path through this flowchart: Q1 = No (regulations permit cloud), Q2 = Yes (large Teradata, Oracle, and SAP HANA estates already on-prem), Q3 = Yes (transfer is acceptable with on-prem PII masking) → Hybrid.
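
The flowchart can also be written as a small function, which makes the order of the questions explicit. This is a sketch of the decision logic above; the function and parameter names are illustrative and not part of the platform.

```python
def choose_topology(all_data_must_stay_on_prem: bool,
                    has_large_on_prem_databases: bool,
                    cloud_transfer_ok_with_masking: bool,
                    annual_budget_usd: float) -> str:
    """Walk Q1 to Q4 in order and return the suggested topology."""
    if all_data_must_stay_on_prem:          # Q1
        return "On-Premises"
    if has_large_on_prem_databases:         # Q2
        if cloud_transfer_ok_with_masking:  # Q3
            return "Hybrid"
        return "On-Premises"
    if annual_budget_usd < 200_000:         # Q4
        return "GCP Full Cloud (Minimum or Realistic)"
    return "GCP Full Cloud (Realistic or Pessimistic)"


# Polkomtel Plus: Q1 = No, Q2 = Yes, Q3 = Yes. The budget is not consulted on this path.
polkomtel = choose_topology(
    all_data_must_stay_on_prem=False,
    has_large_on_prem_databases=True,
    cloud_transfer_ok_with_masking=True,
    annual_budget_usd=250_000,  # hypothetical value for illustration
)
print(polkomtel)  # Hybrid
```

Note that the budget question (Q4) is only reached when there are no large on-prem databases, so it never affects a Hybrid outcome.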

Side-by-side summary

Factor | On-Premises | GCP Full Cloud | Hybrid ⭐
Upfront CapEx | 2.16M PLN (~$540K) | $0 | $0 new
3-Year TCO | ~$2,286,000 | ~$495,000 (realistic) | ~$606,000–$741,000
Time to first pipeline | 14–24 weeks | 3–4 weeks | 10–14 weeks
Elastic scaling | No — fixed hardware | Yes — fully automatic | Yes — GCP platform scales
Data sovereignty | Maximum | GCP Warsaw region | Source data on-prem, metadata in GCP
Dedicated staff | 2+ FTE infra engineers | 0.5 FTE | 0.75 FTE
Best fit | Air-gapped / regulatory ban on cloud | Greenfield, budget-led, fast launch | Large existing on-prem databases

Two deployment realities

The topologies above describe the intended GKE-based GCP architecture. The platform's current live production deployment is a single Debian VPS running Docker Compose — not GKE, and not multi-region. For the build and rollout mechanics of what actually runs today, see Deployment & rollout.


Where to go next

  • For the business case, ROI, and migration economics behind choosing a topology, see Business value & ROI.
  • For the actual build, release, and rollout process — including the live VPS deployment — see Deployment & rollout.
  • For administrative tasks once a topology is live, see the Admin guide.