BDO Kafka Migration


HM needs to migrate its BDO Kafka cluster to Warpstream because support for its current Kubernetes infrastructure is ending. The migration is complex, partly because custom domain name support was only recently added to AWS MSK, and it requires significant changes to the Invicta Agent, DNS, firewall, and NAT settings. The process involves deploying a new Warpstream cluster, updating security groups, and redirecting NLB traffic to the new agents, with data consistency and performance verified before the old brokers are decommissioned. After the migration, data sources in receive, search, and store must be validated to confirm correct data ingestion.

Requirement:

Migrate the HM-managed Kafka cluster to HM-managed Warpstream.

Migration Plan: Transitioning Kafka from HM-Managed to Warpstream

Currently, we have a Kafka cluster managed by HM that we intend to migrate to Warpstream.

Key Challenges

Custom Domain Name (DN):
We use multiple custom DNs for our Kafka cluster.
The multiple custom DNs are to be consolidated into a single custom DN across the different BDO environments.

Migration Complexity:
Migrating to Warpstream involves significant changes on both HM's and BDO’s ends.
The changes include:
Invicta Agent: Uninstall the existing Invicta Filebeat/Winlogbeat agents and replace them with Elastic Fleet agents.
DNS Configuration: Updates to DNS settings managed by BDO.
Firewall Rules: Adjustments to firewall settings to accommodate the new setup.
NAT Settings: Changes to NAT configurations to ensure seamless network traffic flow.


Plan

Monitor AWS MSK's Support for Custom DNs:
Keep track of AWS's progress in supporting custom DNs.
Review and adopt the forthcoming documentation once released.
Prepare for Final Migration:
Plan and execute the necessary changes for migrating to Warpstream, including:
Uninstalling the old agents and installing the new ones.
Modifying DNS settings.
Adjusting firewall and NAT configurations to support the new Warpstream environment.
Perform Final Migration to Warpstream:
Once Warpstream is fully ready, carry out the migration with minimal disruption to operations.
Ensure that all systems and applications are aligned with the new setup on Warpstream.

Kafka Migration to Kubernetes Using Kafka MirrorMaker and Existing NLB Configuration

Migrating a Kafka cluster to Kubernetes requires seamless data transfer and minimal disruption to ongoing operations. This article outlines an approach for migrating the BDO Kafka cluster to Kubernetes, using Kafka MirrorMaker for data synchronization and the existing Network Load Balancer (NLB) configuration for a smooth transition. We will keep the same listeners and target groups and manage traffic redirection through security group updates.

Overview of the Migration Process

The migration process includes deploying a new Kafka cluster on Kubernetes, using Kafka MirrorMaker to synchronize data between the old and new clusters, and gradually shifting traffic to the new brokers. By retaining the existing NLB configuration and strategically managing security groups, we ensure minimal downtime and secure data handling.

Key Components Involved:

Kafka MirrorMaker: Used for real-time data replication from the old Kafka cluster to the new one.
Network Load Balancer (NLB): Manages and directs incoming traffic to Kafka brokers.
Listeners and Target Groups: Existing NLB configurations used for traffic routing.
Security Groups: Control access and secure communication between brokers and clients.

Step-by-Step Migration Process

1. Deploy the New Kafka Cluster on Kubernetes

Setup: Deploy the new Kafka brokers on Kubernetes. Ensure they are configured to handle the same topics and partitions as the old cluster.
Networking: Configure network settings to allow the new Kafka brokers to communicate with each other and with external clients.

2. Configure Kafka MirrorMaker

Deploy MirrorMaker: Set up and run Kafka MirrorMaker. It mirrors data from the old Kafka cluster to the new one on Kubernetes.
Mirror Configuration:
Source Cluster: Point to the old Kafka cluster.
Destination Cluster: Point to the new Kafka cluster.
Topics: Specify which topics to replicate. We can choose to replicate all topics or a subset.
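```bash
# Mirror every topic from the old cluster (consumer.properties)
# to the new cluster (producer.properties).
kafka-mirror-maker.sh --consumer.config consumer.properties \
  --producer.config producer.properties \
  --whitelist=".*"
```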

Run MirrorMaker: Start the Kafka MirrorMaker process to begin replicating data. Monitor the replication to ensure data consistency and correctness.
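
The two property files referenced above are not shown in this document. As a minimal sketch, they could be generated as follows. The bootstrap addresses and the `group.id` value are placeholders, not the real BDO settings, and any SSL or authentication properties the clusters require are left out and would need to be added.

```bash
#!/usr/bin/env bash
# Write minimal MirrorMaker config files.
# Arguments (all placeholders, not the real BDO endpoints):
#   $1 = old cluster bootstrap servers
#   $2 = new cluster bootstrap servers
#   $3 = output directory
write_mirror_configs() {
  local old_bootstrap="$1" new_bootstrap="$2" out_dir="$3"

  # Consumer side: reads from the old (source) cluster.
  # auto.offset.reset=earliest makes a fresh consumer group start
  # from the oldest retained message instead of only new data.
  cat > "${out_dir}/consumer.properties" <<EOF
bootstrap.servers=${old_bootstrap}
group.id=mirror-maker-migration
auto.offset.reset=earliest
EOF

  # Producer side: writes to the new (destination) cluster.
  # acks=all waits for all in-sync replicas before confirming a write.
  cat > "${out_dir}/producer.properties" <<EOF
bootstrap.servers=${new_bootstrap}
acks=all
EOF
}
```

For example, `write_mirror_configs "old-broker:9093" "new-broker:8093" .` would write both files into the current directory, ready to pass to `kafka-mirror-maker.sh`.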

3. Monitor Data Synchronization

Data Consistency: Ensure that Kafka MirrorMaker is replicating data correctly by comparing topics, partitions, and offsets between the old and new clusters.
Performance: Check the performance of the new Kafka brokers to ensure they can handle the incoming load.
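
One rough way to compare the two clusters is to snapshot the end offsets of each and diff the per-topic totals. The sketch below assumes each snapshot file holds `topic:partition:offset` lines, the format printed by Kafka's GetOffsetShell tool; the function names are hypothetical. Because MirrorMaker does not preserve offsets and the old cluster's end offsets include messages already removed by retention, matching totals are only a coarse signal, and some differences are expected.

```bash
#!/usr/bin/env bash
# Sum end offsets per topic from "topic:partition:offset" lines on stdin.
sum_offsets_per_topic() {
  awk -F: 'NF >= 3 { total[$1] += $3 }
           END { for (t in total) printf "%s %.0f\n", t, total[t] }' |
    LC_ALL=C sort -k1,1
}

# Print every topic whose total differs between two snapshot files.
#   $1 = snapshot taken from the old cluster
#   $2 = snapshot taken from the new cluster
# A topic present in only one snapshot is reported as MISSING on the other side.
compare_offset_snapshots() {
  local old_file="$1" new_file="$2"
  LC_ALL=C join -a1 -a2 -e MISSING -o 0,1.2,2.2 \
    <(sum_offsets_per_topic < "$old_file") \
    <(sum_offsets_per_topic < "$new_file") |
    awk '$2 != $3 { print $1 ": old=" $2 " new=" $3 }'
}
```

An empty result means the per-topic totals match; any printed line names a topic to inspect more closely.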

4. Update Security Groups

Identify Security Groups: Identify the security group currently applied to the old Kafka brokers. This group allows traffic from specific CIDRs and the NLB IP addresses.
Reassign Security Groups:
Remove: Detach the security group from the old Kafka brokers.
Add: Attach the same security group to the new Kafka brokers on Kubernetes.
This reallocation ensures that the new Kafka brokers are recognized as healthy by the NLB, while the old brokers are marked as unhealthy due to the absence of the security group.
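
The detach step can be sketched with the AWS CLI as below. This assumes the old brokers are plain EC2 instances with a single network interface; the instance and security group IDs are placeholders. `modify-instance-attribute --groups` replaces the full list of groups, so the sketch reads the current list first and refuses to leave an instance with no group at all.

```bash
#!/usr/bin/env bash
# Remove one security group from an EC2 instance, keeping the others.
#   $1 = instance ID (placeholder, e.g. an old broker)
#   $2 = security group ID to remove
detach_security_group() {
  local instance_id="$1" sg_to_remove="$2"
  local current sg
  local remaining=()

  # Read the security groups currently attached to the instance.
  current=$(aws ec2 describe-instances --instance-ids "$instance_id" \
    --query 'Reservations[].Instances[].SecurityGroups[].GroupId' \
    --output text)

  # Keep every group except the one being removed.
  for sg in $current; do
    if [ "$sg" != "$sg_to_remove" ]; then
      remaining+=("$sg")
    fi
  done

  # An instance must keep at least one security group.
  if [ "${#remaining[@]}" -eq 0 ]; then
    echo "refusing: $instance_id would be left with no security group" >&2
    return 1
  fi

  # Replace the instance's group list with the remaining groups.
  aws ec2 modify-instance-attribute --instance-id "$instance_id" \
    --groups "${remaining[@]}"
}
```

Attaching the group to a new broker is the same `modify-instance-attribute` call with the group added to the list instead of filtered out.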

5. Update Target Groups in the NLB

Add New Brokers: Add the IP addresses of the new Kafka brokers to the existing target groups in the NLB. This step integrates the new brokers into the current traffic flow managed by the NLB.
Health Checks: Verify that the NLB health checks pass for the new Kafka brokers, confirming they are ready to handle traffic.
Traffic Redirection: As the old Kafka brokers are marked unhealthy due to the security group removal, the NLB will automatically redirect traffic to the new brokers.
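
Registering the new brokers and checking their health can be sketched with the AWS CLI as follows. The sketch assumes the target groups use the IP target type; the target group ARN, port, and IP addresses are placeholders to be replaced with the real values.

```bash
#!/usr/bin/env bash
# Register new broker IPs in an existing NLB target group.
#   $1 = target group ARN (placeholder)
#   $2 = port the brokers listen on
#   $3... = broker IP addresses
register_brokers() {
  local tg_arn="$1" port="$2"
  shift 2
  local targets=()
  local ip
  for ip in "$@"; do
    targets+=("Id=${ip},Port=${port}")
  done
  aws elbv2 register-targets --target-group-arn "$tg_arn" \
    --targets "${targets[@]}"
}

# Print each target in the group with its current health state.
#   $1 = target group ARN (placeholder)
show_target_health() {
  aws elbv2 describe-target-health --target-group-arn "$1" \
    --query 'TargetHealthDescriptions[].[Target.Id,TargetHealth.State]' \
    --output text
}
```

After registering, rerun `show_target_health` until the new broker IPs report `healthy` before relying on them for traffic.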

6. Validate New Cluster Operation

Monitor Traffic Flow: Ensure that the NLB is routing traffic to the new Kafka brokers and that producers and consumers are operating smoothly with the new cluster.
Functional Testing: Conduct comprehensive testing to confirm that all Kafka functionalities are working as expected in the new environment.

7. Complete the Migration

Decommission Old Brokers: Once the new Kafka brokers are stable and handling all traffic, remove the old brokers from the target groups and decommission them.
Finalize Security Configurations: Update or remove any security configurations related to the old Kafka brokers as necessary.

Diagram: Kafka Migration to Kubernetes Using Kafka MirrorMaker and Existing NLB Configuration

Below is a visual representation of the migration process:
```plaintext
+-------------------------------------------------------------+
|                          Producers                          |
+-------------------------------------------------------------+
                |                             |
                v                             v
+-------------------------------------------------------------+
|                 Network Load Balancer (NLB)                 |
+-------------------------------------------------------------+
                               |
                               v
+-------------------------------------------------------------+
|                         Listener/s                          |
+-------------------------------------------------------------+
                               |
                               v
+-------------------------------------------------------------+
|                       Target Group/s                        |
|              (Existing and New Kafka Brokers)               |
+-------------------------------------------------------------+
            /   |   \                     /   |   \
         /      |      \               /      |      \
  +-------+ +-------+ +-------+ +-------+ +-------+ +-------+
  |Broker1| |Broker2| |Broker3| |Broker4| |Broker5| |Broker6|
  +-------+ +-------+ +-------+ +-------+ +-------+ +-------+
      ^         ^         ^         ^         ^         ^
      |         |         |         |         |         |
+-------------------------------------------------------------+
|                 Security Groups on Brokers                  |
|  - Old Brokers lose security group (marked unhealthy)       |
|  - New Brokers gain security group (marked healthy)         |
+-------------------------------------------------------------+
                ^                             ^
                |                             |
+-------------------------------------------------------------+
|                      Kafka MirrorMaker                      |
|  - Replicates topics, partitions, and metadata              |
|  - Synchronizes data between old and new brokers            |
+-------------------------------------------------------------+
```

Attachments: Kafka Migration Diagram.jpeg, Kafka Migration Diagram.pdf (72.4 kB)

Explanation of the Diagram:

Producers and Consumers: Applications generating and consuming data connect to Kafka via the NLB.
Network Load Balancer (NLB): Manages traffic and routes it to the Kafka brokers through its listeners.
Listener/s: One or more listeners direct traffic to the brokers in their respective target groups.
Target Group: Contains both old and new Kafka brokers during the migration.
Kafka Brokers:
Old Brokers: The current brokers handling traffic.
New Brokers: Deployed on Kubernetes, ready to take over traffic.
Security Groups:
Old brokers lose their security group, making them unhealthy in the NLB.
New brokers gain the security group, marking them healthy and eligible to handle traffic.
Kafka MirrorMaker: Facilitates real-time data replication between old and new brokers.

Validate Data Sources in Elastic / S3

After migrating the Kafka cluster, validate that all data sources are correctly ingesting data into Elasticsearch and S3 by checking index health, file counts, and timestamps. Monitor Kafka performance by observing broker health, topic activity, and consumer lag.
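
Consumer lag can be summarized from the output of `kafka-consumer-groups.sh --describe`. The sketch below is one possible helper (the function name is hypothetical). It finds the `LAG` column from the header row, since the column layout differs between Kafka versions, and skips rows where the lag is shown as `-`.

```bash
#!/usr/bin/env bash
# Sum the LAG column of `kafka-consumer-groups.sh --describe` output on stdin.
total_lag() {
  awk '
    # Until the header row is seen, look for the LAG column and skip the line.
    lag_col == 0 {
      for (i = 1; i <= NF; i++) if ($i == "LAG") lag_col = i
      next
    }
    # Add numeric lag values; rows showing "-" have no committed offset.
    $lag_col ~ /^[0-9]+$/ { sum += $lag_col }
    END { printf "%.0f\n", sum }
  '
}
```

Usage, with the bootstrap address and group name as placeholders: `kafka-consumer-groups.sh --bootstrap-server "$NEW_BOOTSTRAP" --describe --group "$GROUP" | total_lag`. A total that keeps growing suggests the downstream consumers are not keeping up with the new cluster.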


Technical Steps for Executing the Migration:


Set up the Kubernetes cluster and node groups
Create a Kubernetes cluster for the new Kafka cluster and ZooKeeper instances.
image.png
Create Kubernetes node groups for ZooKeeper and Kafka with the same specifications as in the previous cluster.
image.png
Access the nodes from the CLI.
image.png
Deploy the ZooKeeper and Kafka clusters
Deploy the Kubernetes manifest files for ZooKeeper and Kafka shown in the screenshot below:
image.png
Deploy ZooKeeper first, then Kafka, and confirm that both are running:
image.png
Check the ZooKeeper and Kafka logs to confirm that both are healthy.
Note the IP address of each broker against its broker ID.
image.png
Set up Kafka MirrorMaker to sync from the old Kafka cluster to the new one on port 9093.
Configure the JVM settings:
image.png
Producer config (produces to the new Kafka cluster). Ensure that port 8093 is reachable from the node where the MirrorMaker job runs, and that 8093 is defined in the advertised listeners of the new cluster.
image.png
Consumer config (consumes from the production Kafka cluster):
image.png
Start MirrorMaker for all topics. Run it in the background, because syncing the data and metadata takes some time.
image.png
MirrorMaker for a specific topic:
image.png
image.png
Add the new broker IPs to their target groups on the Network Load Balancer:
Identify the port for which the new IPs need to be registered:
image.png
Add all the IPs to the bootstrap target group.
image.png
Add each broker IP to its respective target group; for example, the kafka-0 IP goes to its own target group, and likewise for the other brokers.
Repeat the above steps for all the ports advertised by the Kafka configuration.

The status of the registered targets changes once the security groups assigned to the EC2 instances of the old brokers are removed.
In the screenshot below, the highlighted instances are the old Kafka brokers, from which the security groups have been removed. The others are the newly created brokers, to which these security groups are now assigned, so they show as healthy:
image.png
Connect to one of the brokers to check the topic details and the increase in volume usage:
image.png
image.png
image.png
Verify that the topics are receiving new data from the endpoints. Specify a recent epoch time to check the latest data being ingested into the topic:
image.png
The screenshot below shows that the latest data is being ingested.
image.png
Validate all other sources/topics the same way using the above command, then confirm that the data reaches Elastic / Humio / S3 correctly.
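
The exact command used for this check is only visible in the screenshots. As one possible approach, assuming records are read with `kafka-console-consumer.sh` and `--property print.timestamp=true` (which prefixes each record with `CreateTime:<epoch millis>` and a tab), the age of the newest record can be computed as below. The function name is hypothetical.

```bash
#!/usr/bin/env bash
# Print the age, in seconds, of the newest record read from stdin.
# Input lines are expected to look like "CreateTime:<epoch millis><TAB><value>".
#   $1 = current time in epoch seconds (for example "$(date +%s)")
newest_record_age() {
  awk -v now="$1" -F'\t' '
    # "CreateTime:" is 11 characters long, so the timestamp starts at position 12.
    $1 ~ /^CreateTime:[0-9]+$/ {
      ts = substr($1, 12) + 0
      if (ts > max) max = ts
    }
    END {
      if (max == 0) { print "no timestamped records"; exit 1 }
      printf "%.0f\n", now - max / 1000
    }
  '
}
```

A small age, relative to how often the source sends data, indicates that the topic is still receiving fresh records on the new cluster.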

Conclusion


The decision to temporarily migrate to a new HM-managed Kafka cluster allows us to maintain operational stability while preparing for a future transition to AWS MSK. This approach provides us with the necessary time to adapt to AWS MSK’s new features and ensures that we are not rushed into a complex migration process under tight deadlines. We are committed to completing this migration seamlessly and will continue to work closely with AWS and our teams to achieve this goal.
Considering all these factors, we plan to migrate to a newly deployed, HM-managed Kafka cluster until MSK is ready to support the custom DN.
The SSL certificate expires in June next year, which gives us about a year to complete the migration to MSK.

Migrating the Kafka cluster to Kubernetes using Kafka MirrorMaker and maintaining the existing NLB configuration offers a streamlined approach with minimal disruption. By carefully managing security groups and leveraging real-time data replication, you can ensure a secure and efficient transition to a scalable and flexible Kafka deployment on Kubernetes.
