
Bladerunner Kafka Migration


HM must migrate its Bladerunner Kafka cluster to AWS MSK due to the end of support for its current Kubernetes infrastructure and an expiring SSL certificate. The migration is complex because custom Domain Name support is not yet available on AWS MSK, requiring significant changes to the Invicta Agent, DNS, firewall, and NAT settings. An interim solution involves setting up a new HM-managed Kafka cluster on Kubernetes, synchronizing data using Kafka MirrorMaker, and redirecting traffic. The final migration to AWS MSK will occur within a year, contingent on AWS's support for custom DNs. The migration process involves deploying a new Kafka cluster, using Kafka MirrorMaker for data synchronization, updating security groups, and redirecting NLB traffic to the new brokers, ensuring data consistency and performance before decommissioning the old brokers. Post-migration, data sources in receive, search, and store must be validated to ensure correct data ingestion.

Requirement:

Migration of the HM-managed Kafka cluster to AWS-managed MSK.

Migration Plan: Transitioning Kafka from HM-Managed to AWS MSK

Currently, we have a Kafka cluster managed by HM that we intended to migrate to AWS Managed Streaming for Apache Kafka (MSK). However, due to certain limitations with MSK and time constraints, this migration cannot proceed as planned. Our immediate priority is to upgrade the Kubernetes infrastructure for our Kafka cluster, because the current version will no longer be supported by AWS after July 15, 2024. Additionally, the SSL certificate for our custom Domain Name (DN) will expire on August 7, 2024.

Key Challenges

Custom Domain Name (DN):
We use a custom DN for our Kafka cluster.
Currently, AWS MSK does not officially support custom DNs, and there is no official documentation available to guide this configuration.
According to AWS support and our account manager at HM, support for custom DNs is on AWS's roadmap, with official documentation expected to be released soon.
Migration Complexity:
Migrating to AWS MSK involves significant changes on both HM's and Bladerunner's ends.
The changes include:
Invicta Agent: Modifications to the custom Filebeat/Winlogbeat agents.
DNS Configuration: Updates to DNS settings managed by Bladerunner.
Firewall Rules: Adjustments to firewall settings to accommodate the new setup.
NAT Settings: Changes to NAT configurations to ensure seamless network traffic flow.

Interim Plan

Given the constraints and the impending deadlines, we have decided to deploy a new Kafka cluster managed by HM. This will serve as an interim solution until AWS MSK is fully equipped to support our custom DN configuration.

Steps for Interim Migration:

Deploy New Kafka Cluster:
Set up a new Kafka cluster managed by HM with upgraded infrastructure on Kubernetes.
Data Migration:
Use Kafka MirrorMaker to synchronize data between the existing and new Kafka clusters.
Ensure all topics, partitions, and consumer groups are replicated accurately.
Traffic Redirection:
Gradually redirect traffic from the old Kafka cluster to the new one by updating security groups and NLB target groups.
Monitor and Validate:
Continuously monitor the new cluster to ensure stability and performance.
Validate that all producers and consumers are functioning correctly with the new Kafka setup.

Long-Term Plan

With the SSL certificate set to expire in June 2025, we have a one-year window to complete the migration to AWS MSK. This timeframe allows us to:
Monitor AWS MSK's Support for Custom DNs:
Keep track of AWS's progress in supporting custom DNs.
Review and adopt the forthcoming documentation once released.
Prepare for Final Migration:
Plan and execute the necessary changes for migrating to MSK, including:
Updating Invicta agent configurations.
Modifying DNS settings as per Bladerunner's requirements.
Adjusting firewall and NAT configurations to support the new MSK environment.
Perform Final Migration to AWS MSK:
Once AWS MSK is fully ready, carry out the migration with minimal disruption to operations.
Ensure that all systems and applications are aligned with the new setup on MSK.

Kafka Migration to Kubernetes Using Kafka MirrorMaker and Existing NLB Configuration

Migrating a Kafka cluster to Kubernetes involves ensuring seamless data transfer and minimal disruption to ongoing operations. This article outlines a comprehensive approach to migrating the Bladerunner Kafka cluster to Kubernetes by leveraging Kafka MirrorMaker for data synchronization and using the existing Network Load Balancer (NLB) configuration for a smooth transition. We'll use the same listeners and target groups while managing traffic redirection through security group updates.

Overview of the Migration Process

The migration process includes deploying a new Kafka cluster on Kubernetes, using Kafka MirrorMaker to synchronize data between the old and new clusters, and gradually shifting traffic to the new brokers. By retaining the existing NLB configuration and strategically managing security groups, we ensure minimal downtime and secure data handling.

Key Components Involved:

Kafka MirrorMaker: Used for real-time data replication from the old Kafka cluster to the new one.
Network Load Balancer (NLB): Manages and directs incoming traffic to Kafka brokers.
Listeners and Target Groups: Existing NLB configurations used for traffic routing.
Security Groups: Control access and secure communication between brokers and clients.

Step-by-Step Migration Process

1. Deploy the New Kafka Cluster on Kubernetes

Setup: Deploy the new Kafka brokers on Kubernetes. Ensure they are configured to handle the same topics and partitions as the old cluster.
Networking: Configure network settings to allow the new Kafka brokers to communicate with each other and with external clients.
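For reference, a minimal sketch of the broker settings this setup depends on is shown below; the hostnames, port, file paths, and replication defaults are assumptions and must be replaced with the actual Bladerunner values.

# server.properties (sketch) -- assumed per-broker settings for the new cluster
broker.id=0
# Listen on all pod interfaces over SSL (port is an assumption; keep it identical to the old cluster)
listeners=SSL://0.0.0.0:9093
# Advertise the custom DN that clients resolve through the NLB (hypothetical hostname)
advertised.listeners=SSL://kafka-0.bladerunner.example.com:9093
ssl.keystore.location=/etc/kafka/secrets/kafka.keystore.jks
ssl.truststore.location=/etc/kafka/secrets/kafka.truststore.jks
# Point at the Zookeeper ensemble deployed alongside the brokers
zookeeper.connect=zookeeper-0:2181,zookeeper-1:2181,zookeeper-2:2181
# Keep topic defaults aligned with the old cluster so mirrored topics match
num.partitions=3
default.replication.factor=3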

2. Configure Kafka MirrorMaker

Deploy MirrorMaker: Set up Kafka MirrorMaker on a node that can reach both clusters. This tool will mirror data from the old Kafka cluster to the new one on Kubernetes.
Mirror Configuration:
Source Cluster: Point to the old Kafka cluster.
Destination Cluster: Point to the new Kafka cluster.
Topics: Specify which topics to replicate. We can choose to replicate all topics or a subset.
kafka-mirror-maker.sh --consumer.config consumer.properties \
--producer.config producer.properties \
--whitelist=".*"

Run MirrorMaker: Start the Kafka MirrorMaker process to begin replicating data. Monitor the replication to ensure data consistency and correctness.
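The consumer.properties and producer.properties files referenced above are not reproduced in this article; the sketch below shows the minimal contents assumed for an SSL setup (hostnames, ports, group id, and truststore paths are placeholders).

# consumer.properties -- reads from the old (source) cluster
bootstrap.servers=old-kafka.bladerunner.example.com:9093
group.id=kafka-mirror-maker
security.protocol=SSL
ssl.truststore.location=/etc/kafka/secrets/truststore.jks
auto.offset.reset=earliest
exclude.internal.topics=true

# producer.properties -- writes to the new (destination) cluster
bootstrap.servers=new-kafka-0:9093,new-kafka-1:9093,new-kafka-2:9093
security.protocol=SSL
ssl.truststore.location=/etc/kafka/secrets/truststore.jks
acks=all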

3. Monitor Data Synchronization

Data Consistency: Ensure that Kafka MirrorMaker is replicating data correctly by comparing topics, partitions, and offsets between the old and new clusters.
Performance: Check the performance of the new Kafka brokers to ensure they can handle the incoming load.
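One way to spot-check both points, assuming the MirrorMaker consumer group is named kafka-mirror-maker and the client SSL settings live in a client-ssl.properties file (both assumptions):

# Compare topic and partition layout between the two clusters (hosts are placeholders)
kafka-topics.sh --bootstrap-server old-kafka.bladerunner.example.com:9093 --describe --command-config client-ssl.properties
kafka-topics.sh --bootstrap-server new-kafka.bladerunner.example.com:9093 --describe --command-config client-ssl.properties

# Check how far MirrorMaker lags behind on the source cluster (the LAG column should trend toward zero)
kafka-consumer-groups.sh --bootstrap-server old-kafka.bladerunner.example.com:9093 --describe --group kafka-mirror-maker --command-config client-ssl.properties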

4. Update Security Groups

Identify Security Groups: Identify the security group currently applied to the old Kafka brokers. This group allows traffic from specific CIDRs and the NLB IP addresses.
Reassign Security Groups:
Remove: Detach the security group from the old Kafka brokers.
Add: Attach the same security group to the new Kafka brokers on Kubernetes.
This reallocation ensures that the new Kafka brokers are recognized as healthy by the NLB, while the old brokers are marked as unhealthy due to the absence of the security group.
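If the brokers run on dedicated EC2 worker nodes, the reassignment can be scripted with the AWS CLI; the instance and security group IDs below are placeholders.

# Attach the Kafka security group to a new broker node (this replaces the instance's current SG set)
aws ec2 modify-instance-attribute --instance-id i-0123456789abcdef0 --groups sg-0123456789abcdef0

# Detach it from an old broker node by assigning only a placeholder/default group
aws ec2 modify-instance-attribute --instance-id i-0fedcba9876543210 --groups sg-0fedcba9876543210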

5. Update Target Groups in the NLB

Add New Brokers: Add the IP addresses of the new Kafka brokers to the existing target groups in the NLB. This step integrates the new brokers into the current traffic flow managed by the NLB.
Health Checks: Verify that the NLB health checks pass for the new Kafka brokers, confirming they are ready to handle traffic.
Traffic Redirection: As the old Kafka brokers are marked unhealthy due to the security group removal, the NLB will automatically redirect traffic to the new brokers.
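A sketch of the equivalent AWS CLI calls, with a placeholder target group ARN, broker IPs, and port:

# Register the new broker IPs (IP target type) with the existing bootstrap target group
aws elbv2 register-targets \
  --target-group-arn arn:aws:elasticloadbalancing:us-east-1:111122223333:targetgroup/kafka-bootstrap/0123456789abcdef \
  --targets Id=10.0.1.11,Port=9093 Id=10.0.1.12,Port=9093 Id=10.0.1.13,Port=9093

# Confirm the new targets pass health checks before relying on them
aws elbv2 describe-target-health \
  --target-group-arn arn:aws:elasticloadbalancing:us-east-1:111122223333:targetgroup/kafka-bootstrap/0123456789abcdef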

6. Validate New Cluster Operation

Monitor Traffic Flow: Ensure that the NLB is routing traffic to the new Kafka brokers and that producers and consumers are operating smoothly with the new cluster.
Functional Testing: Conduct comprehensive testing to confirm that all Kafka functionalities are working as expected in the new environment.
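A simple end-to-end smoke test through the NLB endpoint; the DN, topic name, and client-ssl.properties file are assumptions.

# Produce one test record via the NLB-fronted bootstrap address
echo "migration-smoke-test-$(date +%s)" | kafka-console-producer.sh \
  --broker-list kafka.bladerunner.example.com:9093 \
  --producer.config client-ssl.properties \
  --topic migration_smoke_test

# Read it back to confirm the full produce/consume path works against the new cluster
kafka-console-consumer.sh \
  --bootstrap-server kafka.bladerunner.example.com:9093 \
  --consumer.config client-ssl.properties \
  --topic migration_smoke_test --from-beginning --max-messages 1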

7. Complete the Migration

Decommission Old Brokers: Once the new Kafka brokers are stable and handling all traffic, remove the old brokers from the target groups and decommission them.
Finalize Security Configurations: Update or remove any security configurations related to the old Kafka brokers as necessary.
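Removing the old brokers from the target groups can likewise be scripted; the ARN and IPs are placeholders.

# Deregister the old broker IPs once all traffic is confirmed on the new cluster
aws elbv2 deregister-targets \
  --target-group-arn arn:aws:elasticloadbalancing:us-east-1:111122223333:targetgroup/kafka-bootstrap/0123456789abcdef \
  --targets Id=10.0.0.21,Port=9093 Id=10.0.0.22,Port=9093 Id=10.0.0.23,Port=9093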

Diagram: Kafka Migration to Kubernetes Using Kafka MirrorMaker and Existing NLB Configuration

Below is a visual representation of the migration process:
+--------------------------------------------------+
|                    Producers                     |
+--------------------------------------------------+
                         |
                         v
+--------------------------------------------------+
|           Network Load Balancer (NLB)            |
+--------------------------------------------------+
                         |
                         v
+--------------------------------------------------+
|                    Listener/s                    |
+--------------------------------------------------+
                         |
                         v
+--------------------------------------------------+
|                  Target Group/s                  |
|         (Existing and New Kafka Brokers)         |
+--------------------------------------------------+
     |        |         |         |        |        |
     v        v         v         v        v        v
 +-------+ +-------+ +-------+ +-------+ +-------+ +-------+
 |Broker1| |Broker2| |Broker3| |Broker4| |Broker5| |Broker6|
 +-------+ +-------+ +-------+ +-------+ +-------+ +-------+
     ^        ^         ^         ^        ^        ^
     |        |         |         |        |        |
+--------------------------------------------------------------+
|                 Security Groups on Brokers                    |
|  - Old Brokers lose the security group (marked unhealthy)     |
|  - New Brokers gain the security group (marked healthy)       |
+--------------------------------------------------------------+
                     ^                         ^
                     |                         |
+--------------------------------------------------------------+
|                      Kafka MirrorMaker                        |
|  - Replicates topics, partitions, and metadata                |
|  - Synchronizes data between old and new brokers              |
+--------------------------------------------------------------+


Explanation of the Diagram:

Producers and Consumers: Applications generating and consuming data connect to Kafka via the NLB.
Network Load Balancer (NLB): Manages traffic and routes it to Kafka brokers through a single listener.
Listener/s: The listener(s) direct traffic to brokers in their respective target groups.
Target Group: Contains both old and new Kafka brokers during the migration.
Kafka Brokers:
Old Brokers: The current brokers handling traffic.
New Brokers: Deployed on Kubernetes, ready to take over traffic.
Security Groups:
Old brokers lose their security group, making them unhealthy in the NLB.
New brokers gain the security group, marking them healthy and eligible to handle traffic.
Kafka MirrorMaker: Facilitates real-time data replication between old and new brokers.

Validate Data Sources in Elasticsearch / S3

After migrating the Kafka cluster, validate that all data sources are correctly ingesting data into Elasticsearch and S3 by checking index health, file counts, and timestamps. Monitor Kafka performance by observing broker health, topic activity, and consumer lag.
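A few example spot-checks, assuming the Elasticsearch endpoint, index pattern, S3 bucket/prefix, and consumer group shown here (all placeholders):

# Elasticsearch: today's indices should be green and docs.count should keep growing
curl -s -u "$ES_USER:$ES_PASS" \
  "https://elastic.bladerunner.example.com:9200/_cat/indices/winlogbeat-*?v&h=index,health,docs.count,store.size"

# S3: the newest objects under the archive prefix should carry recent timestamps
aws s3 ls s3://bladerunner-archive/kafka/ --recursive | sort | tail -20

# Kafka: downstream sink consumer groups should show low, stable lag
kafka-consumer-groups.sh --bootstrap-server kafka.bladerunner.example.com:9093 \
  --describe --group logstash-elastic --command-config client-ssl.properties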


Technical Steps for Executing the Migration:


Set up the Kubernetes cluster and node groups
Create a Kubernetes cluster for the new Kafka and Zookeeper instances:
image.png
Create Kubernetes node groups for Zookeeper and Kafka with the same specifications as in the previous cluster:
image.png
Access the nodes from the CLI:
image.png
Deploy Zookeeper and Kafka clusters
Deploy the Kubernetes manifest files for Zookeeper and Kafka as shown in the screenshot below:
image.png
Deploy Zookeeper first, then Kafka, and ensure both are running:
image.png
Check the Zookeeper and Kafka logs to ensure they are running without errors.
Note down the IPs of all brokers along with their broker IDs (see the command sketch after the screenshot):
image.png
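Assuming the pods run in namespaces named kafka and zookeeper (an assumption), the broker pod IPs and startup logs can be checked like this:

# Pod IPs and node placement for the brokers and Zookeeper
kubectl get pods -n kafka -o wide
kubectl get pods -n zookeeper -o wide

# Confirm each broker finished startup
kubectl logs kafka-0 -n kafka | grep -i "started (kafka.server.KafkaServer)"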
Set up Kafka MirrorMaker to sync from the old to the new Kafka cluster on port 9093.
Set up the JVM settings:
image.png
Producer config (produces to the new Kafka cluster). Ensure port 8093 is reachable from the node where the MirrorMaker job runs, and that 8093 is defined in the advertised listeners on the new cluster:
image.png
Consumer config (consumes from the prod Kafka cluster):
image.png
Start the Kafka mirror for all topics. Run it in the background, as syncing the data and metadata takes some time (see the sketch after the screenshots):
image.png
Kafka mirror for a specific topic:
image.png
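A sketch of running the mirror job in the background with an explicit heap; the file names, heap size, and example topic are assumptions.

# Give the MirrorMaker JVM a fixed heap (picked up by kafka-run-class.sh)
export KAFKA_HEAP_OPTS="-Xms4g -Xmx4g"

# All topics, detached from the terminal so the sync can run unattended
nohup kafka-mirror-maker.sh --consumer.config consumer.properties \
  --producer.config producer.properties \
  --whitelist=".*" > mirror-maker-all.log 2>&1 &

# For a single topic (e.g. only winlogbeat), narrow the whitelist regex instead:
#   --whitelist="winlogbeat"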
Add these IPs to their respective target groups in the Network Load Balancer:
Identify the port for which the new IPs need to be defined:
image.png
Add all the IPs to the bootstrap target group:
image.png
Add each broker IP to its respective target group; for example, the kafka-0 IP goes to the kafka-0 target group, and likewise for the other brokers.
Repeat the above steps for all the ports advertised by the Kafka configuration.

The status of the registered targets will change upon removal of the security groups assigned to the EC2 instances of the old brokers.
As shown in the snippet below, the highlighted instances are the old Kafka brokers from which the security groups have been removed; the others are the newly created brokers to which those security groups are now assigned, which is why they appear healthy in the snippet above:
image.png
Connect to one of the brokers to check the topic details and the increase in volume usage (see the sketch after the screenshots):
image.png
image.png
image.png
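Disk usage on the new brokers can also be watched from outside the pods; the namespace and data path below are assumptions.

# Overall data-volume utilisation on a new broker pod
kubectl exec -n kafka kafka-0 -- df -h /var/lib/kafka/data

# Per-partition directory growth, largest first
kubectl exec -n kafka kafka-0 -- sh -c 'du -sh /var/lib/kafka/data/*' | sort -rh | head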
Verify that the topics are receiving new data from the endpoints; specify the latest epoch time to check the most recent data being ingested into the topic (see the sketch after the screenshot):
image.png
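The offset-by-timestamp check can be scripted with the GetOffsetShell tool; the hostname, topic, and epoch value are placeholders, and depending on the Kafka version the tool may require --bootstrap-server and a --command-config file for SSL instead of --broker-list.

# Offset of the first record at or after a given epoch in milliseconds -- non-empty results mean fresh data
kafka-run-class.sh kafka.tools.GetOffsetShell \
  --broker-list kafka.bladerunner.example.com:9093 --topic winlogbeat --time 1719792000000

# Latest offsets per partition (--time -1); rerun and compare to confirm they keep advancing
kafka-run-class.sh kafka.tools.GetOffsetShell \
  --broker-list kafka.bladerunner.example.com:9093 --topic winlogbeat --time -1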
The screenshot below shows that the latest data is being ingested:
image.png
The same needs to be validated for all other sources/topics using the above command, and then confirm that this data reaches Elastic / Humio / S3 correctly.

Conclusion


The decision to temporarily migrate to a new HM-managed Kafka cluster allows us to maintain operational stability while preparing for a future transition to AWS MSK. This approach provides us with the necessary time to adapt to AWS MSK’s new features and ensures that we are not rushed into a complex migration process under tight deadlines. We are committed to completing this migration seamlessly and will continue to work closely with AWS and our teams to achieve this goal.
Considering all these factors, we have planned to migrate to another newly deployed Kafka cluster managed by HM until MSK is ready to handle the custom DN.
The SSL certificate will expire next June, so we have roughly a year to complete this migration to MSK.

Migrating the Kafka cluster to Kubernetes using Kafka MirrorMaker and maintaining the existing NLB configuration offers a streamlined approach with minimal disruption. By carefully managing security groups and leveraging real-time data replication, you can ensure a secure and efficient transition to a scalable and flexible Kafka deployment on Kubernetes.
