Amazon Kinesis is a suite of services designed to facilitate real-time data streaming and processing. It provides a platform to collect, process, and analyze streaming data to generate insights quickly and react to new information. There are four main services within Kinesis:
Kinesis Video Streams
Features:
Purpose: Streams video from connected devices to AWS for analytics, machine learning, and other processing.
Storage and Access: Durably stores, encrypts, and indexes video data streams. Default retention is 24 hours, configurable up to 7 days.
Data Throughput: Not shard-based. The limits of 5 read transactions per second, a 2 MB per second read rate, and 1,000 records per second for writes apply per shard in Kinesis Data Streams, not to Video Streams.
Security: Supports encryption at rest using AWS Key Management Service (KMS).

Use Cases:
Live streaming video analytics
Machine learning model training with video data
Real-time video monitoring and alerting

Kinesis Data Streams
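To make the shard model in this section concrete, here is a minimal Python sketch of three ideas: estimating a shard count from the per-shard limits listed under Features, mapping a partition key to a shard, and splitting a shard's range during resharding. Kinesis Data Streams hashes each partition key with MD5 into a 128-bit integer and routes the record to the shard whose hash key range contains it. The function names, the even split of the hash space, the midpoint split, and the decimal reading of "MB" are illustrative assumptions, not AWS API calls.

```python
import hashlib
import math

# Per-shard limits as listed in this section (decimal MB is an assumption).
WRITE_RECORDS_PER_SEC = 1000
WRITE_BYTES_PER_SEC = 1_000_000   # 1 MB/sec input
READ_BYTES_PER_SEC = 2_000_000    # 2 MB/sec output, shared by consumers

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit hash key


def shards_needed(records_per_sec, avg_record_bytes, consumers=1):
    """Estimate the smallest shard count that satisfies every per-shard limit."""
    write_bytes = records_per_sec * avg_record_bytes
    by_records = math.ceil(records_per_sec / WRITE_RECORDS_PER_SEC)
    by_write = math.ceil(write_bytes / WRITE_BYTES_PER_SEC)
    # Every consumer reads the whole stream, so read volume scales with consumers.
    by_read = math.ceil(write_bytes * consumers / READ_BYTES_PER_SEC)
    return max(1, by_records, by_write, by_read)


def even_hash_ranges(shard_count):
    """Split the hash key space into contiguous, inclusive (start, end) ranges."""
    bounds = [HASH_SPACE * i // shard_count for i in range(shard_count + 1)]
    return [(bounds[i], bounds[i + 1] - 1) for i in range(shard_count)]


def shard_for_key(partition_key, ranges):
    """Return the index of the range that contains the key's MD5 hash."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    for index, (start, end) in enumerate(ranges):
        if start <= hash_key <= end:
            return index
    raise ValueError("hash key falls outside every range")


def split_range(hash_range):
    """Split one range at its midpoint (one possible choice for a shard split)."""
    start, end = hash_range
    mid = (start + end + 1) // 2
    return (start, mid - 1), (mid, end)


if __name__ == "__main__":
    # 3,000 records/sec of 500 bytes each: the record-count limit dominates.
    print(shards_needed(3000, 500))  # 3
    ranges = even_hash_ranges(4)
    print(shard_for_key("device-42", ranges))  # an index between 0 and 3
```

Because the shard is chosen by hashing, records that share a partition key always land on the same shard, which preserves their relative order but can also create a hot shard if one key dominates the traffic.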
Features:
Purpose: Enables real-time processing of streaming big data.
Data Throughput: Each shard ingests up to 1,000 records per second or 1 MB/sec of input, and serves up to 2 MB/sec of output.
Data Retention: Default retention of 24 hours, extendable to 7 days (and, with long-term retention, up to 365 days).
Shards: Basic throughput units; streams can be resharded to adjust capacity.
Security: Supports KMS for encryption. Replicates data synchronously across three AZs.

Use Cases:
Log and Data Feed Intake: Collecting and processing log data in real time.
Real-Time Metrics and Reporting: Generating metrics and reports from live data.
Real-Time Analytics: Processing and analyzing data as it arrives.
Complex Stream Processing: Performing complex transformations and computations on the data stream.

Components:
Producers: Generate data and send it to Kinesis Data Streams.
Consumers: Applications or services that process data from the stream.
Records: Data units in a stream, consisting of a partition key, sequence number, and data blob.
Shards: Units of capacity for data ingestion and retrieval.
Resharding: Allows for splitting and merging of shards to manage throughput and costs.

Kinesis Data Firehose
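The Lambda transformation step listed under Features below follows a simple contract: Firehose passes the function a batch of records with base64-encoded data, and the function returns one entry per `recordId` with a `result` of `Ok`, `Dropped`, or `ProcessingFailed`. The handler below is a minimal sketch of that contract. The JSON payload shape, the `level` filter, and the `processed` field are hypothetical choices made for illustration only.

```python
import base64
import json


def lambda_handler(event, context):
    """Transform each Firehose record and report a per-record result."""
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            if not isinstance(payload, dict):
                raise ValueError("expected a JSON object")
        except (ValueError, TypeError):
            # Return unparseable input unchanged, flagged as failed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
            continue

        # Hypothetical filter rule: drop debug-level events before delivery.
        if payload.get("level") == "DEBUG":
            output.append({
                "recordId": record["recordId"],
                "result": "Dropped",
                "data": record["data"],
            })
            continue

        # Hypothetical enrichment, emitted as newline-delimited JSON.
        payload["processed"] = True
        line = json.dumps(payload) + "\n"
        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": base64.b64encode(line.encode("utf-8")).decode("ascii"),
        })
    return {"records": output}
```

Appending a newline to each transformed record is a common choice when the destination is S3, since Firehose concatenates records within a delivered object and downstream tools usually expect one JSON document per line.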
Features:
Purpose: Simplifies loading streaming data into data stores and analytics services.
Automated Scaling: Fully managed service with no shards to manage.
Data Transformation: Can invoke AWS Lambda to transform data before delivery.
Data Delivery: Batches, compresses, encrypts, and delivers data to destinations such as S3, Redshift, Elasticsearch, and Splunk.
Security: Supports KMS for encryption and replicates data synchronously across three AZs.

Use Cases:
Real-Time Data Loading: Loading streaming data into S3, Redshift, Elasticsearch, or Splunk.
Near Real-Time Analytics: Enabling near real-time analytics with BI tools and dashboards.
Data Transformation: Transforming data on the fly using Lambda before delivery.

Components:
Sources: Where the streaming data is generated.
Delivery Streams: The core entities in Firehose that handle data transport.
Records: Data units sent to Firehose.
Destinations: Final storage locations for processed data.

Kinesis Data Analytics