Kafka Migration Project Plan: Blue-Green Deployment

Project Plan: Kafka Migration Using Blue-Green Deployment


Migrating from Kafka to WarpStream means moving from a broker-heavy, disk-based architecture to a cloud-native, Kafka-compatible platform that uses object storage for durability and stateless proxies for compute. This removes the need to manage brokers, ZooKeeper, and rebalancing, while providing virtually unlimited storage, lower costs, and easier scaling. Because WarpStream speaks the Kafka protocol, existing producers, consumers, and connectors continue to work, which keeps the migration straightforward while reducing operational overhead and improving resilience.

HM has reviewed and recommends implementing the blue-green deployment strategy for the Kafka migration and upgrade, with the current Kafka cluster as Blue (active production) and WarpStream as Green (new environment), to ensure a smooth cutover and rollback capability.

In parallel, Elastic Fleet is being introduced to centrally manage Elastic Agents from the Fleet Server (Kibana UI), making upgrades easier and data collection more streamlined. Because changes are rolled out at the policy level, all agents under a policy automatically adopt updates, which reduces operational overhead and ensures consistency across the environment.

Using HM's domain name (DN) avoids the complexity of managing self-signed SSL certificates: the certificate for this DN is signed by DigiCert, an authorized certificate vendor, and DigiCert's public root CA certificate is already available on most servers.
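Because the certificate is issued by a public CA, a client can validate it against the operating system's default trust store, with no custom CA bundle to distribute. The following is a minimal Python sketch of that idea, not part of the agreed tooling. The function names are hypothetical, and which host and port actually terminate TLS is an assumption to confirm against the Green environment.

```python
import socket
import ssl


def build_tls_context() -> ssl.SSLContext:
    """Return a client context that relies on the system's default CA store."""
    # create_default_context() loads the platform trust store and turns on
    # certificate and host name verification, so no self-signed CA file is needed.
    return ssl.create_default_context()


def peer_certificate_subject(host: str, port: int, timeout: float = 5.0) -> dict:
    """Complete a verified TLS handshake and return the certificate subject.

    The host and port are supplied by the caller; which endpoint serves TLS
    is an assumption, not something this plan specifies.
    """
    context = build_tls_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            cert = tls.getpeercert()
    # The subject is a tuple of RDNs, each a tuple of (key, value) pairs.
    return {key: value for rdn in cert["subject"] for key, value in rdn}
```

If the handshake raises `ssl.SSLCertVerificationError` on a given server, that server's trust store is probably missing the public root and should be checked before agents are installed on it.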

Blue-Green Deployment Strategy diagram for Kafka migration


The "Blue-Green Deployment Strategy for Kafka Migration" diagram illustrates the process of migrating Kafka by running two parallel environments: Blue (current) and Green (new).

Blue = current Kafka production cluster
Green = new WarpStream environment

bladerunner-kafka-migration - Page 1 (3).png
Architecture Diagram

bladerunner-kafka-migration - Page 2.png
NAT Diagram


1. Project Overview

Project Title: Kafka Migration Using Blue-Green Deployment
Objective: Perform a seamless migration from Kafka to WarpStream using a Blue-Green deployment strategy, ensuring zero downtime and reducing risk.
Stakeholders:
HM Team: Platform Operations Team, DevOps Engineers, IT Security
BDO Team: Operations/Application Team, IT Security, Network Team

2. Project Milestones and Timeline

Milestones
Phase | Initial Planning | Kickoff meeting, stakeholder alignment, project setup | TBD | TBD 2 | HM, BDO
Phase 1 | Infrastructure Setup | Provision Green environment; configure VPC, security groups, load balancers | TBD | TBD
Phase 2 | Blue-Green Deployment Setup | Set up the Green environment with load balancer, DNS routing, port configurations, Elastic Fleet Server, etc. | TBD | TBD
Phase 3 | DNS/Network Changes Rollout | Add Green deployment DNS records, NAT the DN to the new IPs, allow ports 9092/8220 for outbound access | TBD | TBD
Phase 4 | Pilot Rollout | Test agent installation on pilot systems, validate with Green environment | TBD | TBD
Phase 5 | UAT Rollout | Gradual migration of UAT systems to Green, validation | TBD | TBD
Phase 6 | Production Rollout | Migrate production agents gradually, monitor stability | TBD | TBD
Phase 7 | Full Production Cutover | Complete migration, switch to Green, decommission Blue environment | TBD | TBD
Phase 8 | Post-Cutover Monitoring | Monitor the environment for 1 week post-cutover | TBD | TBD

3. Phased Approach

Phase 1: Planning and Infrastructure Setup

Objective: Prepare and provision the Green environment with the infrastructure needed to support the Kafka-compatible WarpStream deployment.

Key Tasks:
Task 1.1: Hold kickoff meeting with all stakeholders.
Task 1.2: Create project governance and communication plan.
Task 1.3: Provision VPC, subnets, security groups for the Green environment.
Task 1.4: Set up load balancers for both Blue and Green environments.
Task 1.5: Configure DNS entries and open ports for the Green deployment (TCP 9092, 9080, and 8220).
Owner: HM
Duration: 1 Week
Dependencies: Approval from stakeholders, infrastructure resources available.

Phase 2: Blue-Green Deployment Setup

Objective: Establish both environments (Blue and Green) and configure network traffic management.

Key Tasks:
Task 2.1: Deploy the latest stable WarpStream version in the Green environment.
Task 2.2: Configure the listeners and the listener security protocol map in MSK to accept data from the different BDO environments and infrastructure.
Task 2.3: Create a Network Load Balancer in Invicta AWS to route data to WarpStream in the data managed (dama) AWS account.
Task 2.4: Create DNS records.
Task 2.5: Create SSL certificates.
Task 2.6: Configure the required target groups and listeners to route data to the respective listener.
Task 2.7: Test security group configurations, ensuring proper inbound traffic flow.
Task 2.8: Conduct internal tests to ensure stability (log collection).
Owner: HM
Duration: 1 Week

Phase 3: BDO Network changes Rollout

Objective: Make the network changes required to access the Green environment from BDO infrastructure.
Key Tasks:
Task 3.1: Add the domain names listed below, resolving to BDO's internal IPs.
stream-bladerunner.datamanaged.io
Task 3.2: Make the network changes required to reach the 172.16.1.* IPs from the BDO network.
Task 3.3: NAT the internal IPs to HM's load balancer IPs:
172.16.1.169
172.16.1.126
172.16.1.227
Task 3.4: Allow outbound access as per the list given below:
image.png
image.png
For comparison of ports and domain names
Owner: BDO
Duration: 1-2 Weeks
Dependencies: Green environment functional, internal approvals.
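Once the DNS, NAT, and firewall changes in this phase are in place, they can be verified from a BDO host before any agent is installed. The sketch below is one possible check in Python, using only the standard library. The helper names are hypothetical, and the ports to test should come from the approved outbound access list rather than from this example.

```python
import socket


def resolve_ipv4(hostname: str) -> list[str]:
    """Return the sorted, de-duplicated IPv4 addresses a host name resolves to."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    return sorted({info[4][0] for info in infos})


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_endpoint(hostname: str, ports: list[int]) -> dict[int, bool]:
    """Map each port to whether it is reachable on the given host name."""
    return {port: port_is_open(hostname, port) for port in ports}
```

For example, calling `resolve_ipv4("stream-bladerunner.datamanaged.io")` on a BDO server should return the BDO internal IPs configured in Task 3.1, and `check_endpoint` with the same host name and the allowed ports should report every port as reachable. A `False` entry points to a firewall or NAT rule that is still missing.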

Phase 4: Pilot Rollout

Objective: Migrate a set of 10 Kafka producer agents (Invicta Filebeat/Winlogbeat) to the Green environment to validate system stability.

Key Tasks:
Task 4.1: Select pilot systems (non-critical systems) for migration.
Task 4.2: Uninstall Invicta agents from pilot systems.
Task 4.3: Install new Elastic Fleet agents pointing to the Green deployment.
Task 4.4: Monitor agent connectivity and validate log ingestion.
Task 4.5: Validate system stability and alerting mechanisms.
Task 4.6: Validate remote upgrades and configuration changes.
Owner: BDO with support from HM
Duration: 1-2 Weeks
Dependencies: Green environment functional, internal approvals.
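Validating ingestion for the pilot can be reduced to a comparison: which of the selected pilot hosts have not yet been seen sending data to Green? The Python sketch below illustrates that comparison only. How the list of reporting hosts is obtained (for example, an export from the Fleet UI) is outside its scope, and the function name and sample host names are hypothetical.

```python
def missing_agents(expected_hosts, reporting_hosts):
    """Return pilot hosts that have not yet reported to the Green environment.

    Host names are compared case-insensitively with surrounding whitespace
    removed, since inventories and agent data often differ in formatting.
    """
    def normalize(name):
        return name.strip().lower()

    seen = {normalize(host) for host in reporting_hosts}
    expected = {normalize(host) for host in expected_hosts}
    return sorted(expected - seen)


# Hypothetical pilot inventory versus hosts observed in Green.
pilot = ["App01", "db02 ", "web03"]
observed = ["app01", "WEB03"]
print(missing_agents(pilot, observed))  # ['db02']
```

An empty result means every pilot host has reported at least once. It does not by itself prove that all expected log sources on each host are flowing.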

Phase 5: UAT Rollout

Objective: Gradually migrate UAT systems to the Green environment, while testing and validating performance.

Key Tasks:
Task 5.1: Coordinate with BDO teams to schedule migration of UAT systems.
Task 5.2: Perform agent uninstallation and installation (Green environment).
Task 5.3: Test log ingestion, alerting, and connectivity in UAT environment.
Task 5.4: Gather feedback and address issues before full production migration.
Owner: HM & BDO
Duration: 2-3 Weeks
Dependencies: Successful pilot phase, DNS configurations.

Phase 6: Production Rollout

Objective: Gradually migrate all production systems to the Green environment and continuously monitor performance.

Key Tasks:
Task 6.1: Begin migration of low-priority production systems.
Task 6.2: Monitor system stability, agent connectivity, and alerting.
Task 6.3: Gradually move critical systems and applications over.
Task 6.4: Address issues, if any, during migration.
Owner: HM (for support), BDO
Duration: 3-4 Weeks
Dependencies: Successful UAT migration, stakeholder approval.
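The ordering in Tasks 6.1 and 6.3 (low-priority systems first, critical systems afterwards) can be expressed as a simple wave plan. The following Python sketch shows one way to build it. The `System` type, the wave size, and the sample system names are illustrative assumptions, not part of the agreed schedule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class System:
    name: str
    critical: bool


def _chunk(items, size):
    """Split a list into consecutive groups of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def plan_waves(systems, wave_size):
    """Return migration waves: all non-critical waves first, then critical ones.

    Critical and non-critical systems are never mixed in the same wave.
    """
    if wave_size < 1:
        raise ValueError("wave_size must be at least 1")
    low = sorted(s.name for s in systems if not s.critical)
    high = sorted(s.name for s in systems if s.critical)
    return _chunk(low, wave_size) + _chunk(high, wave_size)


# Hypothetical inventory; real names and criticality come from BDO.
inventory = [
    System("payments", True),
    System("wiki", False),
    System("build", False),
    System("crm", False),
]
print(plan_waves(inventory, 2))  # [['build', 'crm'], ['wiki'], ['payments']]
```

Keeping critical systems in their own waves means a problem found while migrating low-priority systems can be fixed before any critical system is touched.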

Phase 7: Full Production Cutover and Decommissioning of Blue Environment

Objective: Finalize the migration, transition all traffic to the Green environment, and decommission the Blue environment.

Key Tasks:
Task 7.1: Complete migration of all agents.
Task 7.2: Switch load balancer and DNS entries to route all traffic to Green.
Task 7.3: Decommission Blue environment after final validation.
Task 7.4: Conduct post-cutover monitoring and troubleshooting.
Owner: HM and BDO
Duration: 1 Week
Dependencies: Full migration of agents, Green environment stability.

Phase 8: Post-Cutover Monitoring

Objective: Monitor the environment post-migration to ensure no critical issues arise.

Key Tasks:
Task 8.1: Monitor data collection.
Task 8.2: Ensure agents communicate properly with the Green environment.
Task 8.3: Address any post-migration issues.
Owner: HM and BDO
Duration: 1 Week
Dependencies: Full migration completed.
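During the monitoring week, one simple signal for Task 8.2 is how long it has been since each agent last checked in. The Python sketch below flags agents that have been silent for longer than a threshold. The 15-minute default, the function name, and the sample agent names are assumptions for illustration. The source of the last check-in timestamps is not specified by this plan.

```python
from datetime import datetime, timedelta, timezone


def stale_agents(last_seen, now, max_silence=timedelta(minutes=15)):
    """Return the names of agents whose last check-in is older than max_silence.

    `last_seen` maps agent name to a timezone-aware datetime of its last
    check-in; `now` is passed in explicitly so the check is easy to test.
    """
    return sorted(
        name for name, seen in last_seen.items() if now - seen > max_silence
    )


# Hypothetical snapshot of agent check-in times.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
snapshot = {
    "app01": now - timedelta(minutes=5),
    "db02": now - timedelta(hours=2),
}
print(stale_agents(snapshot, now))  # ['db02']
```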

4. Communication Plan

Status
Kickoff Meeting | Once | Initial project discussion and alignment | HM, BDO | HM
Weekly Status Update | Weekly | Track project progress, discuss risks/issues | HM
UAT Phase Progress Meeting | As needed | Review UAT results and validate pilot performance | HM
Production Rollout Check-ins | Daily/As needed | Daily standups during the production migration phase | HM
Final Cutover and Review Meeting | Once (Post-Cutover) | Review cutover success, discuss any post-deployment actions | HM

5. Risk Management

Risk
Risk | Impact | Likelihood | Mitigation
Connectivity issues during cutover | High | Medium | Perform rigorous testing during Pilot and UAT phases before full production cutover
Agent migration failures | Medium | Low | Validate agent installations thoroughly during the Pilot phase
Data loss during migration | High | Very Low | Use dual-running Blue and Green environments until all data is confirmed stable

6. Rollback Plan

Triggers for Rollback:

Critical issues in the Green environment that cannot be resolved.
Persistent agent connectivity issues.
Major data loss during migration.
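The three triggers above can be written down as an explicit check so that a rollback decision is based on agreed criteria rather than judgment in the moment. The Python sketch below is only an illustration: the `GreenHealth` fields, the 10% disconnected-agent threshold, and the function name are assumptions that HM and BDO would need to agree on.

```python
from dataclasses import dataclass


@dataclass
class GreenHealth:
    unresolved_critical_issues: int  # issues in Green with no known fix
    disconnected_agent_ratio: float  # 0.0 to 1.0, sustained over the review window
    major_data_loss: bool


def rollback_reasons(health, max_disconnected_ratio=0.10):
    """Return the rollback triggers that currently apply (empty list if none)."""
    reasons = []
    if health.unresolved_critical_issues > 0:
        reasons.append("critical issues in the Green environment that cannot be resolved")
    if health.disconnected_agent_ratio > max_disconnected_ratio:
        reasons.append("persistent agent connectivity issues")
    if health.major_data_loss:
        reasons.append("major data loss during migration")
    return reasons


# Hypothetical readings: a healthy state and a state that meets every trigger.
print(rollback_reasons(GreenHealth(0, 0.02, False)))  # []
print(len(rollback_reasons(GreenHealth(1, 0.50, True))))  # 3
```

Any non-empty result would start the rollback procedure. The returned list doubles as the justification recorded for stakeholders.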

Rollback Procedure:

Disable DNS entries for the Green environment and revert networking configurations.
Revert load balancer settings to redirect traffic back to the Blue environment.
Uninstall Green agents and reinstall Blue agents on client systems.
Validate that all agents are communicating with the Blue environment and no data is lost.
Owner: HM and BDO
Timeframe: 1-2 Days