Amazon Kinesis Data Firehose simplifies the process of loading streaming data into data stores and analytics tools. It captures, transforms, and loads streaming data in near real-time, enabling seamless integration with existing business intelligence tools and dashboards. Here's a breakdown of its features, components, and capabilities:
Key Features
Streamlined Data Loading:
The easiest way to load streaming data into data stores and analytics tools, with no custom applications to write and no infrastructure to manage.
Handles capturing, transforming, and loading streaming data end to end.
Real-Time Analytics:
Enables near real-time analytics by delivering streaming data to business intelligence tools and dashboards for immediate insights and decision-making.
Source Integration:
Supports Kinesis Data Streams as a data source, so records flow from an existing stream into Firehose for further processing and delivery.
Data Transformation:
Allows you to configure Firehose to transform streaming data before delivering it to the destination.
Can perform operations such as batching, compression, and encryption of data to optimize delivery and ensure data security.
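As a sketch of the batching and compression ideas above, the following Python helpers gzip-compress newline-delimited JSON events client-side and group them into batches of at most 500 records, which is the PutRecordBatch per-call record limit. The event shape and function names here are hypothetical; Firehose can also apply GZIP compression server-side via the destination configuration.

```python
import gzip
import json


def prepare_record(event: dict) -> bytes:
    """Serialize one event as newline-delimited JSON and gzip it.

    Firehose delivers record payloads as-is, so compressing client-side
    (or enabling Firehose's built-in GZIP compression for S3 delivery)
    reduces storage and transfer costs. The 'event' shape is hypothetical.
    """
    payload = (json.dumps(event) + "\n").encode("utf-8")
    return gzip.compress(payload)


def batch_records(events, max_batch=500):
    """Group events into batches of at most 500 records,
    the PutRecordBatch per-call limit."""
    batch = []
    for event in events:
        batch.append({"Data": prepare_record(event)})
        if len(batch) == max_batch:
            yield batch
            batch = []
    if batch:
        yield batch
```

Each yielded batch would then be sent with a boto3 call such as `firehose.put_record_batch(DeliveryStreamName=name, Records=batch)`. Note that PutRecordBatch also caps the total payload per call (4 MiB), which this sketch does not enforce.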
Automated Replication:
Synchronously replicates data across three Availability Zones (AZs) for fault tolerance and high availability.
Ensures that data is delivered to destinations reliably; delivery is at-least-once, so downstream consumers should tolerate occasional duplicates.
Flexible Destinations:
Supports multiple destinations for delivering streaming data, including Amazon S3, Amazon Redshift, Amazon OpenSearch Service (formerly Amazon Elasticsearch Service), and Splunk.
Allows you to choose the appropriate destination based on your data storage and analysis requirements.
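To make the destination configuration concrete, here is a minimal sketch of the parameters you might pass to boto3's `create_delivery_stream` for an Amazon S3 destination. The bucket, role, and stream names are placeholders, and the buffering values are illustrative, not recommendations.

```python
def s3_delivery_stream_params(stream_name: str, bucket_arn: str, role_arn: str) -> dict:
    """Build the parameter dict for firehose.create_delivery_stream with
    an Amazon S3 destination.

    BufferingHints control how much data Firehose accumulates before it
    writes an object to S3; GZIP compression is enabled as an example.
    All ARNs and names passed in are placeholders.
    """
    return {
        "DeliveryStreamName": stream_name,
        "DeliveryStreamType": "DirectPut",  # or "KinesisStreamAsSource"
        "ExtendedS3DestinationConfiguration": {
            "BucketARN": bucket_arn,
            "RoleARN": role_arn,  # IAM role Firehose assumes to write to S3
            "BufferingHints": {"SizeInMBs": 5, "IntervalInSeconds": 300},
            "CompressionFormat": "GZIP",
        },
    }
```

In practice this dict would be unpacked into the API call, e.g. `boto3.client("firehose").create_delivery_stream(**s3_delivery_stream_params(...))`.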
Key Components
Source:
Where streaming data is continuously generated and captured, typically from Kinesis Data Streams or other streaming sources.
Delivery Stream:
The underlying entity of Amazon Kinesis Data Firehose, responsible for receiving, transforming, and delivering streaming data to destinations.
Represents the configuration and settings for data delivery, including transformations and destination configurations.
Record:
The unit of data sent to a delivery stream, representing the actual data payload.
Each record can have a maximum size of 1,000 KiB before Base64 encoding.
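A producer can guard against oversized records before sending them. This small check is a sketch, assuming the quota above; the limit applies to the raw payload, not the Base64-encoded form the SDK produces for transport.

```python
MAX_RECORD_BYTES = 1000 * 1024  # 1,000 KiB per-record limit, before Base64 encoding


def validate_record(data: bytes) -> bytes:
    """Reject payloads over the Firehose per-record size limit.

    Checking locally lets the producer fail fast (or split the payload)
    instead of getting the record rejected by the service.
    """
    if len(data) > MAX_RECORD_BYTES:
        raise ValueError(
            f"record is {len(data)} bytes; limit is {MAX_RECORD_BYTES}"
        )
    return data
```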
Destination:
The data store or analytics tool where streaming data is delivered for storage or analysis.
Supported destinations include Amazon S3, Amazon Redshift, Amazon OpenSearch Service, and Splunk.
Integration and Security
Encryption: Supports encryption of data using an existing AWS Key Management Service (KMS) key for enhanced security.
Server-Side Encryption: Uses server-side encryption for data stored in Amazon S3 when Kinesis Data Streams is used as the data source.
Transformation: Can invoke AWS Lambda functions to transform incoming data before delivering it to destinations, providing flexibility in data processing.
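A transformation Lambda receives a batch of Base64-encoded records and must return each record's `recordId`, a `result` of `Ok`, `Dropped`, or `ProcessingFailed`, and the re-encoded data. The handler below is a minimal sketch that uppercases a hypothetical `message` field in JSON payloads; the field name is an assumption for illustration.

```python
import base64
import json


def handler(event, context):
    """Firehose data-transformation Lambda (sketch).

    Each input record's 'data' field is Base64-encoded. The response must
    echo the recordId, set a result of Ok / Dropped / ProcessingFailed,
    and carry the transformed data, Base64-encoded again. The 'message'
    field transformed here is hypothetical.
    """
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            payload["message"] = payload.get("message", "").upper()
            out_data = base64.b64encode(json.dumps(payload).encode("utf-8")).decode("ascii")
            output.append(
                {"recordId": record["recordId"], "result": "Ok", "data": out_data}
            )
        except (ValueError, KeyError):
            # Malformed payloads are marked failed; Firehose can retry or
            # route them to the configured error output.
            output.append(
                {
                    "recordId": record["recordId"],
                    "result": "ProcessingFailed",
                    "data": record["data"],
                }
            )
    return {"records": output}
```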
For Amazon S3 destinations, streaming data is delivered to your S3 bucket. If data transformation is enabled, you can optionally back up the source data to another Amazon S3 bucket.
For Amazon Redshift destinations, streaming data is delivered to your S3 bucket first. Kinesis Data Firehose then issues an Amazon Redshift COPY command to load the data from your S3 bucket into your Amazon Redshift cluster. If data transformation is enabled, you can optionally back up the source data to another Amazon S3 bucket.
For Amazon OpenSearch Service (formerly Amazon Elasticsearch Service) destinations, streaming data is delivered to your OpenSearch cluster, and it can optionally be backed up to your S3 bucket concurrently.
For Splunk destinations, streaming data is delivered to Splunk, and it can optionally be backed up to your S3 bucket concurrently.
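One delivery detail worth handling in producer code: PutRecordBatch can succeed partially, returning a nonzero FailedPutCount with an ErrorCode set on the failed entries of RequestResponses. The helper below resends only the failed records; `send_batch` stands in for a real boto3 call such as `lambda recs: firehose.put_record_batch(DeliveryStreamName=name, Records=recs)`, and the retry count is an arbitrary example (production code would also add backoff).

```python
def put_with_retries(send_batch, records, max_attempts=3):
    """Resend only the failed records from a PutRecordBatch-style call.

    The response's RequestResponses list has one entry per input record,
    in order; entries with an ErrorCode failed and should be retried.
    Returns the records still unsent after max_attempts (empty on success).
    """
    pending = records
    for _ in range(max_attempts):
        response = send_batch(pending)
        if response.get("FailedPutCount", 0) == 0:
            return []
        # Keep only the records whose paired response entry carries an error.
        pending = [
            rec
            for rec, res in zip(pending, response["RequestResponses"])
            if "ErrorCode" in res
        ]
    return pending
```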