Exam guide
Multiple choice - 1 correct response, 3 incorrect responses
Multiple response - 2 correct responses from 5 options
Design Resilient Architectures - 30%
- Design a multi-tier architecture solution
- Design highly available and/or fault-tolerant architectures
- Design decoupling mechanisms using AWS services
- Choose appropriate resilient storage

Design High-Performing Architectures - 28%
- Identify elastic and scalable compute solutions for a workload
- Select high-performing and scalable storage solutions for a workload
- Select high-performing networking solutions for a workload
- Choose high-performing database solutions for a workload

Design Secure Applications and Architectures - 24%
- Design secure access to AWS resources
- Design secure application tiers
- Select appropriate data security options

Design Cost-Optimized Architectures - 18%
- Identify cost-effective storage solutions
- Identify cost-effective compute and database services
- Design cost-optimized network architectures

The result is a score from 100 to 1,000, and the minimum to pass is 720. In total, there are 65 questions with 130 minutes to complete them.
Fundamentals
AWS consists of Regions and Availability Zones where:
Regions - Physical locations in the world, e.g., Frankfurt
Availability Zones (AZ) - data centers, i.e., buildings filled with servers.
1 region consists of 2 or more Availability Zones, located far enough away from each other to count as separate AZs.
Edge Location - another type of AWS location: endpoints used for caching content.
AWS offers a huge number of services, and the list keeps growing. However, knowing the following services is sufficient to pass this exam.
Main services of AWS
How to process the information
EC2, Lambda, Elastic Beanstalk
How to save information
S3, EBS, EFS, FSx, Storage Gateway
How to store and retrieve information
RDS, DynamoDB, Redshift
How Compute, Storage, and Databases communicate with each other
VPC, Direct Connect, Route 53, API Gateway, AWS Global Accelerator
5 Pillars of the Well-Architected Framework
- Operational Excellence
- Security
- Reliability
- Performance Efficiency
- Cost Optimization
IAM
IAM = Identity and Access Management → manage users and their level of access to the AWS Console
- Create users and grant permissions to those users
- Control access to AWS resources

Root account
The account created with the email address used to sign up for AWS → full admin access
Therefore, this account must be secured by:
- Enable multi-factor authentication on the root account
- Create an admin group for your admins, and assign the appropriate permissions to this group
- Create user accounts for your admins
- Add your users to the admin group

Permission Control using IAM
Permissions are governed by Policy Documents in JSON format, which are assigned to Groups, Users, or Roles and are independent of regions.
Example:
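A minimal illustrative sketch of what such a Policy Document looks like and how it can be attached to a group with boto3 (the policy and group names are hypothetical):

```python
import json
import boto3

# A Policy Document in JSON: who may do what, on which resources
policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:*", "Resource": "*"}
    ],
}

iam = boto3.client("iam")
policy = iam.create_policy(
    PolicyName="AllowS3FullAccess",            # hypothetical name
    PolicyDocument=json.dumps(policy_document),
)
# Best practice: attach the policy to a group, not to individual users
iam.attach_group_policy(
    GroupName="developers",                    # hypothetical group
    PolicyArn=policy["Policy"]["Arn"],
)
```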
Usually, Policy Documents are not assigned directly to Users, as that quickly becomes hard to manage. The best practice is to create a group (even if it contains only that one user) and assign the Policy Document to the group.
Building Blocks
- Users - a physical person
- Groups - functions → admin, dev, etc.
- Roles - internal usage within AWS
- Principle of least privilege - only assign a user the minimum amount of privileges they need to do their job
Tips
- Access key ID and secret access keys are used for programmatic authentication
- Username and password are used for console login authentication
- [Access key ID and secret access key] and [username and password] are not the same
- Access key ID and secret access keys can be viewed only once. If lost, they have to be regenerated.

S3
Simple Storage Service (S3) - Object storage which is scalable and simple to use
Object storage → can store anything but cannot run OS or DB
Basics
S3 offers unlimited storage, but each object can be at most 5 TB.
All objects are stored in folder-like containers called Buckets, whose names must be globally unique.
Object URL format:
https://{bucket-name}.s3.{region}.amazonaws.com/{key-name}
Example:
https://acloudguru.s3.us-east-1.amazonaws.com/Raphie.jpg
When uploading to an S3 bucket, an HTTP 200 response is returned upon success.
S3 Object
An S3 object is composed of:
- Version ID - stores multiple versions of the object
- Metadata - data about data (content-type, last-modified)
- Access Control List (ACL) vs. Bucket Policy - an ACL governs accessibility of each individual object, while the Bucket Policy controls access to all objects in the bucket. An ACL cannot override the Bucket Policy.
Versioning
Advantages:
- All versions are stored in S3, even if the object is deleted
- The object is effectively already backed up
- Once versioning is enabled, it cannot be disabled
- Can be integrated with lifecycle rules
- Note that public access applies only to the latest version; older versions require individual public-access settings

Deleting a versioned object does not remove it; S3 only adds a delete marker. To completely remove a file, you must explicitly delete each of its versions (and the delete marker) from the bucket.
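A short boto3 sketch of this (the bucket name is hypothetical; the key reuses the example above): enabling versioning, then permanently removing every version of one object.

```python
import boto3

s3 = boto3.client("s3")

# Enable versioning on a bucket (it cannot be disabled afterwards, only suspended)
s3.put_bucket_versioning(
    Bucket="my-notes-bucket",                          # hypothetical bucket
    VersioningConfiguration={"Status": "Enabled"},
)

# A plain delete only adds a delete marker; deleting specific VersionIds
# is what permanently removes the object from the bucket.
versions = s3.list_object_versions(Bucket="my-notes-bucket", Prefix="Raphie.jpg")
for v in versions.get("Versions", []):
    s3.delete_object(Bucket="my-notes-bucket", Key=v["Key"], VersionId=v["VersionId"])
```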
Storage classes
S3 Standard
- High availability and durability - data is stored redundantly across multiple devices in multiple facilities (>= 3 AZs)
- Designed for frequent access - perfect for frequently accessed data
- Suitable for most workloads - websites, content distribution, mobile and gaming apps, big data analytics

S3 Standard-Infrequent Access (Standard-IA)
- Used for data that is accessed less frequently but requires rapid access when needed
- Low per-GB storage price, but there is a per-GB retrieval fee
- Great for long-term storage, backups, and as a data store for disaster recovery files

S3 One Zone-Infrequent Access
- Like Standard-IA but costs 20% less
- Data is stored redundantly within a single AZ

S3 Glacier
- Retrieval time from 1 minute to 12 hours

S3 Glacier Deep Archive
- Default retrieval time is 12 hours

S3 Intelligent-Tiering
- For data with an unknown access pattern
- Automatically moves the data to the most cost-effective tier based on how frequently it is accessed

With lifecycle management, one can automatically move objects between storage tiers to save cost. This also applies to versioning, i.e., archiving old-version files to a cheaper storage class; lifecycle rules can target both current-version and previous-version files.
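A hedged boto3 sketch of such a lifecycle rule (bucket name, prefix, and day counts are hypothetical): current versions move to Standard-IA after 30 days and Glacier after 90, while previous versions go straight to Glacier.

```python
import boto3

s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="my-notes-bucket",                          # hypothetical bucket
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "tier-down-logs",
                "Status": "Enabled",
                "Filter": {"Prefix": "logs/"},         # hypothetical prefix
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                ],
                "NoncurrentVersionTransitions": [
                    {"NoncurrentDays": 30, "StorageClass": "GLACIER"},
                ],
            }
        ]
    },
)
```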
S3 Object Lock
An object with Object Lock is stored in a write once, read many (WORM) model to prevent it from being deleted or modified for a fixed amount of time (the retention period) or indefinitely, adding an extra layer of protection. There are two retention modes:
- Governance mode - objects and their versions cannot be overwritten or deleted, and their lock settings cannot be altered, except by users with special permissions.
- Compliance mode - objects and their versions cannot be overwritten or deleted, and their lock settings cannot be altered, regardless of the user's permission level. Even the root user cannot touch objects in this mode until the retention period expires.
Retention period vs Legal Holds
Both prevent an object from being overwritten or deleted, but:
- Retention period - set as a duration; in Governance mode it can be changed by users with the right permissions.
- Legal hold - remains in effect until removed; it can be placed and removed by any user with the s3:PutObjectLegalHold permission.

S3 Glacier Vault Lock
- S3 Glacier Vault Lock allows you to deploy and enforce compliance controls for individual S3 Glacier vaults with a vault lock policy
- You can specify controls, like WORM, in a vault lock policy and lock the policy against future edits. Once locked, the policy can no longer be changed.

Encryption
- Encryption in transit - while sending objects to/from the bucket
- Encryption at rest: Server-Side Encryption (SSE) - encryption performed by S3
  - SSE-S3: keys are managed by S3; users do not need to worry about anything
  - SSE-KMS: keys are managed by AWS Key Management Service
  - SSE-C: keys are managed by the customer
- Encryption at rest: Client-Side Encryption - the user encrypts files before uploading them to S3

Server-side encryption can be enforced by:
- Selecting the encryption setting on the S3 bucket in the Console
- A bucket policy that rejects any PUT request uploading a file without the x-amz-server-side-encryption parameter in the request header (see the sketch below)
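A hedged sketch of such a bucket policy applied with boto3 (the bucket name is hypothetical), followed by a compliant upload that carries the encryption header:

```python
import json
import boto3

s3 = boto3.client("s3")

# Deny any PutObject request that does not carry the
# x-amz-server-side-encryption header at all.
bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyUnencryptedUploads",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::my-notes-bucket/*",     # hypothetical bucket
            "Condition": {"Null": {"s3:x-amz-server-side-encryption": "true"}},
        }
    ],
}
s3.put_bucket_policy(Bucket="my-notes-bucket", Policy=json.dumps(bucket_policy))

# A compliant upload: boto3 sets the header when SSE is requested.
s3.put_object(
    Bucket="my-notes-bucket",
    Key="Raphie.jpg",
    Body=open("Raphie.jpg", "rb"),
    ServerSideEncryption="AES256",     # SSE-S3; use "aws:kms" for SSE-KMS
)
```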
Optimizing S3 Performance

An object is accessed by a path such as my-notes-bucket/folder1/subfolder/file.txt. The parts that are neither the bucket name nor the file name (here /folder1/subfolder) are called the prefix. Spreading requests across more distinct prefixes gives higher request performance, because S3 scales request rates per prefix.

- Upload → multipart upload increases upload speed for files over 100 MB and should be used for any file over 5 GB, as sketched below.
- Download → S3 byte-range fetches increase download speed and allow partial downloads, e.g., fetching only the header of a file.
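A minimal boto3 sketch of both techniques (file, bucket, and key names are hypothetical); the high-level transfer API switches to multipart upload automatically above the configured threshold.

```python
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# Multipart upload: anything above 100 MB is split into 100 MB parts
config = TransferConfig(
    multipart_threshold=100 * 1024 * 1024,
    multipart_chunksize=100 * 1024 * 1024,
    max_concurrency=10,
)
s3.upload_file("backup.tar", "my-notes-bucket", "backups/backup.tar", Config=config)

# Byte-range fetch: download only the first KB of the object (e.g., its header)
head = s3.get_object(
    Bucket="my-notes-bucket",
    Key="backups/backup.tar",
    Range="bytes=0-1023",
)["Body"].read()
```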
S3 Replication

- Replicates objects from a bucket in one region to a bucket in another region
- Existing objects in the bucket are not replicated automatically when replication is enabled; upload a new version of an object to start replicating it
- Delete markers are not replicated by default
EC2
Elastic Compute Cloud (EC2) - Secure, resizable compute cloud → Virtual Machine hosted in AWS
Pricing options
On-Demand - pay by the hour or second, depending on your needs
- Flexible - low cost and flexibility without upfront payment
- Short-term - applications with short-term, spiky, or unpredictable workloads
- Testing the water - applications being tested on EC2 for the first time

Reserved - commit to a contract of 1 or 3 years to get up to a 72% discount on the hourly charge
- Predictable usage - applications with steady-state or predictable usage
- Specific capacity requirements - applications that require reserved capacity
- Pay up front - save more when paying upfront
- Standard Reserved Instances - up to 72% off the on-demand price
- Convertible Reserved Instances - up to 54% off the on-demand price, with the option to change to a different instance type of equal or greater value
- Scheduled Reserved Instances - launch instances within a predefined time window

Spot - purchase unused capacity at a discount of up to 90%; the price fluctuates with supply and demand
- Flexible - applications that have flexible start and end times
- Urgent capacity - users with an urgent need for large amounts of additional computing capacity
- Cost sensitive - applications that are only feasible at very low compute prices

Dedicated - a physical EC2 server dedicated to you; the most expensive option. If a question mentions licensing restrictions, go straight for the Dedicated option
- Compliance - regulatory requirements that may not support multi-tenant virtualization
- Licensing - great for licensing that does not support multi-tenancy or cloud deployments
- On-Demand - can be purchased hourly on demand
- Reserved - can be purchased as a reservation for up to 70% off the on-demand price

Roles
identity that you can create in IAM that has specific permissions
Similar to a user → an AWS identity with permission policies that determine what it can and cannot do, but a role is meant to be assumed by whoever needs it (including groups of users) rather than being tied to one person.
Roles can be assumed by users, AWS services, and system-level accounts, and can even be used for cross-account access.
Security Groups
Computers usually communicate with each other over well-known ports, e.g., SSH (22), RDP (3389), HTTP (80), HTTPS (443).
Once an EC2 instance is created, a virtual firewall blocks everything by default. To be able to connect to the EC2 instance, you need to open the correct ports using Security Groups, or let everything in with 0.0.0.0/0. In production, open only ports 80 and 443 so that others cannot gain control of the EC2 instance via SSH or RDP.
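A hedged boto3 sketch of opening only the web ports on a security group (the group ID is hypothetical):

```python
import boto3

ec2 = boto3.client("ec2")
ec2.authorize_security_group_ingress(
    GroupId="sg-0123456789abcdef0",   # hypothetical security group ID
    IpPermissions=[
        {"IpProtocol": "tcp", "FromPort": 80, "ToPort": 80,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},   # HTTP from anywhere
        {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},   # HTTPS from anywhere
    ],
)
```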
Bootstrap script
A script that runs once when an instance first starts. Usually it is a shell script containing installation, update, and configuration commands.
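A hedged sketch of passing such a bootstrap script as user data when launching an instance with boto3 (the AMI ID is a placeholder):

```python
import boto3

ec2 = boto3.client("ec2")

bootstrap = """#!/bin/bash
yum update -y
yum install -y httpd
systemctl enable --now httpd
"""

ec2.run_instances(
    ImageId="ami-xxxxxxxxxxxxxxxxx",   # placeholder: pick a real Amazon Linux 2 AMI
    InstanceType="t3.micro",
    MinCount=1,
    MaxCount=1,
    UserData=bootstrap,                # boto3 base64-encodes this for you
)
```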
Metadata
metadata = data about data → IP address, hostname, security groups, etc.
- retrieve metadata from EC2 → http://169.254.169.254/latest/meta-data/
- retrieve user data from EC2 → http://169.254.169.254/latest/user-data
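A small sketch of querying both from inside the instance, assuming the classic IMDSv1 endpoint is enabled (IMDSv2 additionally requires a session token):

```python
import urllib.request

BASE = "http://169.254.169.254/latest"

def imds(path: str) -> str:
    """Query the instance metadata service from inside the instance."""
    with urllib.request.urlopen(f"{BASE}/{path}", timeout=2) as resp:
        return resp.read().decode()

print(imds("meta-data/instance-id"))   # the instance ID
print(imds("meta-data/local-ipv4"))    # private IP address
print(imds("user-data"))               # the bootstrap script passed at launch
```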
Virtual Networking in EC2
- ENI (Elastic Network Interface) - basic day-to-day networking
  - create a management network
  - use network and security appliances in your VPC
  - a private, home-network-style subnet
- EN (Enhanced Networking) - single-root I/O virtualization → high performance (10 Gbps - 100 Gbps)
  - higher bandwidth, higher packets per second (PPS), lower inter-instance latency
  - comes as either the Elastic Network Adapter (ENA) or the Intel 82599 Virtual Function (VF) interface; in any scenario question, always choose ENA over the VF interface
- EFA (Elastic Fabric Adapter) - accelerates high performance computing (HPC) and ML applications
  - lower and more consistent latency and higher throughput than TCP transport
  - when a question asks which network interface to use for high performance computing, go straight to EFA
  - it uses OS-bypass to speed up HPC and ML applications with lower latency; when a question mentions OS-bypass, go straight to EFA

Placement groups
- Cluster Placement Group - a group of instances within a single AZ → for applications that need low network latency, high network throughput, or both
- Spread Placement Group - a group of instances placed on separate hardware → for applications that need hardware separation, e.g., for security or appliance reasons
- Partition Placement Group - groups of instances where each partition has its own network and power source → isolation that reduces the impact of hardware failure

Spot Instances
EC2 allows users to consume unused capacity in the cloud at up to a 90% discount compared to the on-demand price.
When to use?
Applications using Spot instances must be stateless, fault-tolerant, and flexible. For example:
- containerized workloads
- CI/CD
- high-performance computing

Spot instances are not a good fit for workloads that are stateful or cannot tolerate interruption, such as persistent workloads and databases.
Prices
Spot instances remain available as long as the spot price stays below your predefined maximum price. The price varies across regions and AZs. If the price exceeds your maximum, you get 2 minutes to stop the instance (and resume it later when the spot price drops back below your max) or terminate it.
You can use a Spot Block to prevent your instances from being terminated when the price rises above your max price, but a block can currently only be set for up to 6 hours.
Spot requests
To start using Spot instances, one needs to define:
- max price - the maximum price you are willing to pay
- number of instances - how many instances to run
- launch spec - machine spec, AMI, etc.
- request type {one-time | persistent} - request capacity once, or keep requesting over a validity period. For a persistent request, it is critical to cancel the request before terminating its instances; otherwise the request will keep launching new instances every time you terminate them.
Spot Fleet
A combination of Spot instances and On-Demand instances that attempts to maintain a target capacity within your price constraints. There are several allocation strategies:
- capacityOptimized - Spot instances come from the pools with optimal capacity for the number of instances being launched
- diversified - Spot instances are distributed across all pools
- lowestPrice (default) - Spot instances come from the pool with the lowest price
- instancePoolsToUseCount - Spot instances are distributed across the number of Spot instance pools you specify; can only be used in combination with lowestPrice

Elastic Block Storage (EBS) and Elastic File System (EFS)
EBS
Storage volumes you can attach to EC2 instances - store the AMI, apps, databases, etc. → a virtual hard drive
EBS must be in the same AZ as the EC2 instance. It can be resized or have its type changed on the fly, without stopping or restarting the instance.
EBS volume types
- General Purpose SSD (gp2)
  - balance of price and performance
  - good for boot volumes or development and test applications that are not latency sensitive
- General Purpose SSD (gp3)
  - newer generation of gp2 with up to 4x the maximum throughput
  - for the exam you do not need to choose between gp2 and gp3; both are suitable boot volumes
- Provisioned IOPS SSD (io1)
  - high-performance SSD, for when you need more than 16,000 IOPS
  - I/O-intensive apps, large databases, latency-sensitive workloads, or apps that need a high level of durability
- Provisioned IOPS SSD (io2)
  - higher durability than io1
- Throughput Optimized HDD (st1)
  - frequently accessed, throughput-intensive workloads
  - for big data, data warehouses, ETL, and log processing
  - for apps that do not need SSD-level performance
IOPS vs Throughput
- IOPS - the number of read/write operations per second; matters for transactional, low-latency workloads → Provisioned IOPS SSD (io1/io2)
- Throughput - the amount of data transferred per second; matters for large, sequential I/O such as big data → Throughput Optimized HDD (st1)
Volumes
A virtual hard drive - you need at least 1 volume per EC2 instance → the root device volume.
Snapshot → a point-in-time copy of an EBS volume, stored in S3. Snapshots are incremental: each one stores only the blocks that changed since the previous snapshot, so no redundant data is kept.
Encryption
Volumes and snapshots can be encrypted using AWS Key Management Service (KMS) with AWS-managed or customer master keys (CMKs).
If a volume is unencrypted, it can be encrypted with the following steps:
1) Create a snapshot of the unencrypted volume
2) Copy the snapshot and select the encrypt option
3) Create an AMI from the encrypted snapshot
4) Use that AMI to launch new, encrypted instances
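The same steps as a hedged boto3 sketch (volume ID, region, and device names are hypothetical):

```python
import boto3

ec2 = boto3.client("ec2")

# 1) Snapshot the unencrypted volume
snap = ec2.create_snapshot(VolumeId="vol-0123456789abcdef0",   # hypothetical volume
                           Description="unencrypted source")
ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snap["SnapshotId"]])

# 2) Copy the snapshot with encryption enabled (the default KMS key is used here)
copy = ec2.copy_snapshot(SourceSnapshotId=snap["SnapshotId"],
                         SourceRegion="us-east-1",
                         Encrypted=True)

# 3) Register an AMI from the encrypted snapshot ...
image = ec2.register_image(
    Name="encrypted-ami",
    VirtualizationType="hvm",
    RootDeviceName="/dev/xvda",
    BlockDeviceMappings=[{"DeviceName": "/dev/xvda",
                          "Ebs": {"SnapshotId": copy["SnapshotId"]}}],
)

# 4) ... and launch encrypted instances from it
ec2.run_instances(ImageId=image["ImageId"], InstanceType="t3.micro",
                  MinCount=1, MaxCount=1)
```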
Behaviors

Start an EC2 instance
Stop an instance
Terminate an instance
- The root device volume is deleted by default

Hibernate an instance
- Saves the state of RAM to EBS (to the root device first; if that does not work, to a data volume)

Start from hibernation
- The root volume is restored to its previous state
- Boots up much faster (RAM is already reloaded)
- Resumes previously running processes
- Reattaches data volumes and keeps the same instance ID
- Cannot stay hibernated for longer than 60 days

EFS
A network file system that can be mounted on multiple EC2 instances at once (also works with EC2 instances in multiple AZs).
- Compatible with Linux-based AMIs (not Windows)
- The file system scales automatically
- Pay per use (the most expensive storage option here)

Tiers
EFS has multiple tiers and lifecycle management to move data from one tier to another after x amount of days:
- Standard - for frequently accessed files
- Infrequently Accessed (IA) - for files that are not frequently accessed

FSx
fully managed native Windows file system
AMI
Amazon Machine Image (AMI) provides the information required to launch an instance. There are two categories
- Amazon EBS-backed - instances launched from an EBS snapshot provided by Amazon
- Instance Store-backed - instances launched from a template stored in S3 (ephemeral storage). These cannot be stopped; otherwise all data is lost.

Databases
Relational Databases Service (RDS)
A relational database stores datasets in tables made of rows and columns.
Database engines available on RDS: SQL Server, Oracle, MySQL, PostgreSQL, MariaDB, and Amazon Aurora.
RDS is generally used for Online Transaction Processing (OLTP) NOT Online Analytical Processing (OLAP)
RDS can be deployed in multi-AZ mode to ensure high availability in case of a failure in the primary AZ. This creates a standby replica of the RDS instance in another AZ as a backup. However, you cannot access the standby directly as long as the primary is still healthy.
A read replica is another way of copying an RDS instance, but the replica is read-only. It takes read load off the primary RDS instance, improving scalability. Read replicas can be created both in other AZs and in other regions.
Aurora
A MySQL- and PostgreSQL-compatible relational database engine that combines the speed and availability of high-end commercial databases with the simplicity and cost-effectiveness of open-source databases.
- 5x the performance of MySQL
- 3x the performance of PostgreSQL

DynamoDB
A nonrelational (NoSQL) database from Amazon.

VPC
VPC = a virtual data center in the cloud → define your own network, including the IP address range, subnets, route tables, and network gateways.
CIDR
When setting up a VPC, a range of IP addresses has to be assigned to the resources inside it. CIDR notation defines how many IP addresses are available.
A typical question on this topic: given an amount of resources, how should the range be defined? The addresses are written as xxx.xxx.xxx.xxx/n, and n has to be identified. A quick and dirty calculation is to find the largest n (i.e., the smallest block) that satisfies:
2^(32-n) >= number of required addresses
For example, if 20 IPs are required, this leads to n = 27, so the range is xxx.xxx.xxx.xxx/27.
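A tiny Python sketch of that calculation (it ignores the 5 addresses AWS reserves in every subnet):

```python
import math

def smallest_block_prefix(required_addresses: int) -> int:
    """Largest /n (smallest CIDR block) with 2^(32-n) >= required_addresses."""
    return 32 - math.ceil(math.log2(required_addresses))

print(smallest_block_prefix(20))    # 27 -> a /27 block provides 32 addresses
print(smallest_block_prefix(200))   # 24 -> a /24 block provides 256 addresses
```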
Find more details in CIDR.xyz
Subnet
A VPC usually includes 3 tiers:
- web - public-facing subnet
- application - private subnet; talks only to the web tier and the database tier
- database - private subnet; talks only to the application tier

Keep in mind that 1 subnet always lives in exactly 1 AZ.
Network Address Translation (NAT) Gateway
Enables instances in a private subnet (if there is no private subnet, a NAT gateway is not required) to connect to the internet or other AWS services, while preventing the internet from initiating connections to those instances. Behind the scenes it acts like an additional managed instance in your public subnet that routes outbound connections from your private subnets. NAT gateways have the following properties:
- Redundant inside the AZ (AWS runs more than one instance behind the scenes to support this task within the AZ)
- Starts at 5 Gbps and scales up to 45 Gbps
- No patching needed → AWS does this job
- Not associated with security groups
- Automatically assigned a public IP address

Note: if resources in multiple AZs share one NAT gateway and that gateway's AZ goes down, resources in the other AZs lose internet access. → To create an AZ-independent architecture, create a NAT gateway in each AZ and configure your routing so resources use the NAT gateway in their own AZ.
Security Groups
Virtual firewalls for our resources in the subnet. By default, everything is blocked. To communicate over other channels, the correct ports must be opened. Port numbers to remember: SSH (22), RDP (3389), HTTP (80), HTTPS (443).
To let everything in: set 0.0.0.0/0
Network ACL
First line of defense (optional)
Controls inbound and outbound traffic. The default network ACL allows all inbound and outbound traffic; a newly created custom one denies all traffic until rules are added.
Route 53
Routes a domain name (DNS) to an IP address.
Routing Policies
- Simple Routing Policy - one record with multiple IP addresses. If a record contains multiple values, all values are returned to the user in random order.
- Weighted Routing Policy - splits traffic based on assigned weights.
- Failover Routing Policy - routes traffic to the active site and uses health checks to detect failure; if the active site fails, traffic is routed to the passive site.
- Geolocation Routing Policy - routes traffic based on the user's geographic location (where the DNS queries originate).
- Geoproximity Routing Policy (Traffic Flow) - routes traffic by combining the user's geographic location, resource availability, and latency, applying a bias before sending the user to a resource.
- Latency Routing Policy - routes traffic based on the lowest network latency for the user. Requires a latency resource record set for the EC2 instance or ELB in each region hosting the website.
- Multivalue Answer Routing Policy - similar to simple routing, but checks the health of each resource first and only returns healthy resources.

ELB
Automatically distributes incoming application traffic across multiple targets. Can be done across multiple AZs.
ELB Types
- Application Load Balancer - best suited for HTTP and HTTPS traffic
- Network Load Balancer - capable of handling extreme performance load balancing
- Classic Load Balancer - the legacy, previous-generation load balancer. If you need to know the user's IP address behind it, use the X-Forwarded-For header.

Monitoring
CloudWatch is a monitoring and observability platform to identify potential issues
CloudWatch features:
- System Metrics - metrics you get out of the box
- Application Metrics - more information from inside the EC2 instance, obtained by installing the CloudWatch agent
- Alarms - alert you when something goes wrong, or take an action such as stopping an instance

Standard monitoring delivers metrics every 5 minutes, while detailed monitoring delivers them every 1 minute.
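A hedged boto3 sketch of such an alarm (instance ID and SNS topic ARN are hypothetical): alert when average CPU stays above 80% for two 5-minute periods.

```python
import boto3

cw = boto3.client("cloudwatch")
cw.put_metric_alarm(
    AlarmName="high-cpu",
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],  # hypothetical
    Statistic="Average",
    Period=300,                     # standard 5-minute metrics
    EvaluationPeriods=2,
    Threshold=80.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:ops-alerts"],       # hypothetical topic
)
```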
CloudWatch Logs
A tool to monitor, store, and access log files from a variety of sources. It supports SQL-like queries on the logs to find potential issues or relevant data. Important terms:
- Log Event - a record of what happened; contains a timestamp and the data
- Log Stream - a collection of Log Events from the same source
- Log Group - a collection of Log Streams

If the logs do not need further processing, send them straight to S3. Otherwise, send them to CloudWatch Logs.
If real-time logging service is required, choose Kinesis instead of CloudWatch Logs
High Availability and Scaling
- Vertical scaling - increase the performance of the instance
- Horizontal scaling - add more instances of the current size

3W of scaling
- What - what sort of resource are we going to scale, and how do we define the template?
- Where - where do we scale? Where does the model go: should we scale out the database or the web server?
- When - how do we know that we need more resources? CloudWatch alarms provide the data.

To scale out, one needs a Launch Template, which is more or less like a Launch Configuration but newer, versioned, and recommended.
Autoscaling Group
A collection of EC2 instances that scales according to a predetermined strategy with the following steps:
1) Define template
2) Networking and purchasing - don’t forget multiple AZs for high availability
3) ELB config - the Auto Scaling group can follow the load balancer's health checks
4) Set scaling policy - select min, max, desired capacity
5) Notification - set SNS to let us know when the event happens
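A hedged boto3 sketch tying these steps together (template, subnet, and target group identifiers are hypothetical):

```python
import boto3

asg = boto3.client("autoscaling")
asg.create_auto_scaling_group(
    AutoScalingGroupName="web-asg",
    LaunchTemplate={"LaunchTemplateName": "web-template",    # hypothetical template (step 1)
                    "Version": "$Latest"},
    VPCZoneIdentifier="subnet-aaaa1111,subnet-bbbb2222",     # subnets in two AZs (step 2)
    TargetGroupARNs=[                                        # attach to the ELB (step 3)
        "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/web/0123456789abcdef"
    ],
    HealthCheckType="ELB",
    HealthCheckGracePeriod=300,
    MinSize=2, MaxSize=6, DesiredCapacity=2,                 # capacity bounds (step 4)
)
```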
Scaling Types:
- Reactive scaling - scale according to the measured load
- Scheduled scaling - scale according to a time schedule
- Predictive scaling - scale according to an AWS ML algorithm, re-evaluated every 24 hours to forecast the next 48 hours

Relational Database Scaling
- Vertical scaling - resize the database instance to a bigger or smaller size
- Scaling storage - increase storage size; once increased, it cannot be scaled back down
- Read replicas - create read-only copies of the data to spread out the read workload
- Aurora Serverless - AWS takes care of scaling; works really well for unpredictable workloads

Nonrelational Database Scaling (DynamoDB)
You set the capacity mode on the DynamoDB table:
- Provisioned - for predictable workloads
- On-Demand - pay as you go

Decoupling Workflows
Tightly coupled - components talk to each other directly, one-to-one.
Loosely coupled - components talk through an intermediary (e.g., a queue or load balancer), so they can fail and scale independently.
If one service fails in a tightly coupled architecture, the whole system fails. Therefore, always prefer a loosely coupled architecture over a tightly coupled one.
Simple Queue Service (SQS)
Poll-based messaging → an asynchronous messaging service: the writer puts a message into a queue and, when the reader is ready, it eventually comes to fetch the message.
SQS Settings
- Delivery Delay - how long readers must wait before they can read a message after it arrives in the queue. Default is 0; can be set up to 15 minutes.
- Message Size - a message in any format up to 256 KB; the maximum size can be configured smaller, but not bigger.
- Encryption - messages are encrypted in transit by default; at-rest encryption can be enabled optionally.
- Message Retention - how long a message stays in the queue. Default is 4 days; can be set between 1 minute and 14 days.
- Polling:
  - Short (default) - connect, check for a message, disconnect, then connect again, and so on. This burns CPU drastically.
  - Long (recommended) - connect, check for a message, and wait a while before disconnecting.
- Queue Depth - the number of messages in the queue; can be used as a trigger for autoscaling.
- Visibility Timeout - how long a fetched message stays hidden in the queue. If the reader fails while processing it, the message reappears in the queue so another reader can fetch it.

Dead-Letter Queue (DLQ)
If there is a bad message in the queue, a reader fetches it and fails, so the message reappears after the Visibility Timeout. This loop repeats until the Message Retention period is reached. → Solve this by creating another SQS queue, a Dead-Letter Queue (DLQ), to move such messages into.
It is also important to set up a CloudWatch alarm to monitor queue depth.
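A hedged boto3 sketch that sets several of these options at once (queue name and DLQ ARN are hypothetical): long polling, a 60-second visibility timeout, and a redrive policy that moves messages to the DLQ after 5 failed receives.

```python
import json
import boto3

sqs = boto3.client("sqs")

dlq_arn = "arn:aws:sqs:us-east-1:123456789012:orders-dlq"   # hypothetical, created beforehand
queue = sqs.create_queue(
    QueueName="orders",
    Attributes={
        "ReceiveMessageWaitTimeSeconds": "20",   # long polling
        "VisibilityTimeout": "60",
        "MessageRetentionPeriod": "345600",      # 4 days (the default)
        "RedrivePolicy": json.dumps(
            {"deadLetterTargetArn": dlq_arn, "maxReceiveCount": "5"}
        ),
    },
)

# Long poll: wait up to 20 s instead of hammering the queue with empty receives
messages = sqs.receive_message(
    QueueUrl=queue["QueueUrl"], WaitTimeSeconds=20, MaxNumberOfMessages=10
)
```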
SQS Message Ordering
- Standard queues - best-effort ordering, at-least-once delivery (duplicates possible), nearly unlimited throughput
- FIFO queues - messages are processed strictly in order and exactly once, but throughput is limited (300 messages per second without batching)
Simple Notification Service (SNS)
Push-based messaging → a proactive messaging service that sends notifications to subscribed endpoints.
SNS Settings
- Subscribers - who will receive the SNS messages
- Message Size - a message can be up to 256 KB
- DLQ Support - messages that fail to be delivered can be stored in an SQS DLQ
- FIFO or Standard - FIFO only supports SQS as a subscriber
- Encryption - messages are encrypted in transit by default; at-rest encryption can be enabled optionally
- Access Policy - a resource policy can be added, as with S3

API Gateway
A fully managed service that allows you to publish, create, maintain, monitor, and secure your APIs, acting as a "front door" to your applications.
Big Data
3V of Big Data
- Volume - ranges from TB to PB
- Variety - data from various sources and in various formats
- Velocity - data needs to be collected, stored, processed, and analyzed within a short period of time

Redshift
A relational database turned data warehouse service that can handle petabyte-scale data (but runs in only one AZ).
Redshift is not a standard relational database; do not use it just to replace RDS.
Elastic MapReduce (EMR)
a managed fleet of EC2 instances running open-source tools like Spark, Hive, HBase, Flink, Hudi, and Presto
Kinesis
A service for dealing with real-time data. There are 2 versions of Kinesis:
- Kinesis Data Streams - real-time ingestion; you write consumers to process the data, and data can be retained and replayed
- Kinesis Data Firehose - near real-time; delivers data directly to destinations such as S3, Redshift, or Elasticsearch without custom consumers
When looking for a message broker:
- SQS - a messaging broker that is simple to use and does not require much configuration, but it cannot deliver real-time data
- Kinesis - a bit more complicated to configure, but suitable for real-time applications

Athena
A serverless SQL solution that allows users to query data directly in S3 without loading it into a database.
Glue
a serverless data integration service to perform ETL workloads without managing underlying servers
QuickSight
a BI data visualization service to create and share dashboards
Elasticsearch
Amazon's managed offering of the open-source search engine Elasticsearch; works very well for log analytics (beyond CloudWatch Logs). It usually appears as part of the Elasticsearch, Logstash, and Kibana (ELK) stack.
Serverless Architecture
Serverless means focusing only on application code and letting the provider manage the compute infrastructure for us.
Lambda
The simplest form of serverless computing, where the user only has to:
- select a runtime - the environment the code runs in
- set permissions - attach a role if the function needs to make AWS API calls
- set networking - VPC, subnet, and security groups can also be configured for Lambda, but are not required
- set resources - define the amount of CPU and RAM (128 MB - 10 GB) and the timeout (up to 15 min) the function will need
- set a trigger - tells the function when to run
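A minimal sketch of what such a function's code looks like in the Python runtime; Lambda invokes the handler with the trigger's event payload:

```python
import json

def lambda_handler(event, context):
    """Entry point: 'event' carries the trigger's payload, 'context' runtime info."""
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"hello {name}"}),
    }
```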
Container

A unit of software that packages up code and its dependencies so that it is flexible enough to run from one environment to another.
Terminologies
- Dockerfile - a text document that contains all the commands or instructions used to build an image
- Image - an immutable file that contains the code, libraries, dependencies, and config files needed to run an app
- Registry - storage and distribution for Docker images
- Container - a running copy of the image

Container Management Systems (ECS vs EKS)
AWS offers different services which are capable of managing multiple containers
Elastic Container Service (ECS)
The simplest choice; AWS manages everything. It can scale up to thousands of containers while remaining easy to use, and it integrates with ELB.
Elastic Kubernetes Service (EKS)
Kubernetes is an open-source platform for managing containers that can run on-premises and in the cloud. Since not everything is managed by AWS, some configuration is required. If a cross-cloud or hybrid setup is needed, EKS is the only choice.
Fargate
A serverless compute engine for running containers from ECS or EKS. This choice appears when selecting where the containers will be hosted (the alternative being your own EC2 instances).
EventBridge (CloudWatch Events)
A serverless event bus that passes events from a source to an endpoint; effectively the glue that holds serverless applications together.
Security
Distributed Denial of Service (DDoS) Attack
An attack intended to make our server fail to respond to real users.
- Layer 4 attacks - come through the TCP layer, e.g., SYN floods or NTP amplification attacks
- Layer 7 attacks - floods of GET/POST requests

CloudTrail
A tool that logs what is happening in our AWS account via API calls and stores the logs in an S3 bucket; it does not capture RDP or SSH traffic.
What is logged:
- metadata around the API calls
- the source IP address of the API caller
- the response elements returned by the service

AWS Shield → for Layer 4
Free protection against DDoS (Layer 3 and 4) attacks on ELB, Amazon CloudFront, and Route 53. For about $3,000 more per month, AWS Shield Advanced adds a dedicated 24/7 DDoS Response Team and protects your AWS bill against the extra ELB, CloudFront, and Route 53 charges caused by an attack.
AWS Web Application Firewall (WAF) → for Layer 7
A tool that monitors HTTP and HTTPS requests before they are forwarded to CloudFront or an ELB. It can be configured to:
- Allow all requests except the ones specified
- Block all requests except the ones specified
- Count the requests that match the specified properties

In general, WAF blocks Layer 7 attacks, e.g., HTTP DDoS, SQL injection, and cross-site scripting.
GuardDuty
A threat monitoring and detection service that uses AI. It takes 7-14 days to learn what "normal" behavior looks like, and it uses continuously updated external databases of known malicious domains. Afterwards, if GuardDuty detects a threat (based on sources such as CloudTrail logs, VPC Flow Logs, and DNS logs), it reports it in the GuardDuty console and in CloudWatch Events, which can in turn trigger a Lambda function to address the threat.
Macie
A service that uses AI to analyze your S3 buckets and alert you when you are storing Personally Identifiable Information (PII), Personal Health Information (PHI), or financial data.
Inspector
An automated security assessment service that helps improve the security and compliance of applications deployed on AWS. It scans two main areas: EC2 instances and VPCs.
Type of assessments:
- Network assessment - analyzes the network configuration to check whether ports are reachable from outside the VPC → no agent needed
- Host assessment - analyzes vulnerable software, host hardening, and security best practices → needs an agent

Key Management Service (KMS)
Amazon's centralized key management service: you create and manage Customer Master Keys (CMKs). It relies on Hardware Security Modules (HSMs) hosted by AWS; an HSM plays a role similar to a Trezor hardware wallet. With KMS alone, it is like sharing that Trezor with other AWS users (multi-tenant hardware). The CloudHSM service gives you a dedicated, single-tenant HSM instead. Note, however, that CloudHSM does not provide automatic key rotation.
Secrets Manager vs Parameter Store
- Secrets Manager - paid; stores secrets and can rotate them automatically (e.g., RDS credentials)
- Parameter Store - free; stores configuration values and secrets, but has no automatic rotation
Certificate Manager
A service to create, manage, and deploy public and private SSL certificates for use with other AWS services → ELB, CloudFront distributions, API Gateway. There is therefore no need to manually renew and update certificates. Also, it is free.
Automation
CloudFormation
A template in JSON or YAML used to deploy AWS services. It is composed of 3 sections:
- parameters - user-dependent questions to be filled in when the template is run
- mappings - values that fill themselves in based on conditions, such as the region being deployed to
- resources - all resources to deploy

Note: hard-coded values and resource IDs (e.g., a hard-coded AMI ID) can make the template fail in other regions.
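A minimal sketch of those three sections, written here as a Python dict and deployed with boto3 (stack name and AMI IDs are hypothetical); the mapping avoids the hard-coded AMI problem mentioned above.

```python
import json
import boto3

template = {
    "Parameters": {
        "InstanceType": {"Type": "String", "Default": "t3.micro"},
    },
    "Mappings": {
        "RegionAmi": {  # look up the right AMI per region instead of hard-coding one
            "us-east-1": {"ami": "ami-aaaaaaaa"},       # hypothetical AMI IDs
            "eu-central-1": {"ami": "ami-bbbbbbbb"},
        },
    },
    "Resources": {
        "WebServer": {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "InstanceType": {"Ref": "InstanceType"},
                "ImageId": {"Fn::FindInMap": ["RegionAmi", {"Ref": "AWS::Region"}, "ami"]},
            },
        },
    },
}

boto3.client("cloudformation").create_stack(
    StackName="demo-stack",
    TemplateBody=json.dumps(template),
)
```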
Elastic Beanstalk
A Platform-as-a-Service (PaaS): just bring your web app code and Beanstalk manages the EC2 architecture for you. It is not serverless, unfortunately.
Systems Manager
A set of tools to view, control, and automate your AWS architecture and on-premises resources (an agent must be installed on the instances). Exam questions usually refer not to Systems Manager itself but to its features, e.g., Automation Documents, Session Manager, or Parameter Store.
Caching
A solution that keeps copies of external content somewhere close to the consumers to improve overall performance.
CloudFront
Caches content at edge locations. Edge locations can only be selected by region, not by specific country.
ElastiCache vs DynamoDB Accelerator (DAX)
ElastiCache - a managed offering of the open-source caching engines Memcached and Redis. It is designed for caching in front of RDS-style databases.
DAX - designed specifically for caching DynamoDB.
Global Accelerator
Provides static IP addresses that sit between your users and the ELB, so clients that cache IP addresses do not break when endpoints change.
Governance
Organizations
A centralized tool to manage multiple AWS accounts. An important feature is Service Control Policies (SCPs), which globally limit permissions in member accounts, including their root users.
A Deny statement in an SCP means users cannot perform the specified actions. An Allow statement in an SCP means users can perform only the specified actions and nothing else.
Tip: whenever a question asks about granting access across accounts or to AWS services, a role is usually the answer.
Migration
Move data to AWS
- Normal internet → insecure, slow
- Direct Connect → secure, fast, but costly if used only for a short period of time
- Physical → ship drives to AWS

Snow Family
Physical devices that AWS ships to you for migration. Some of them also include compute capability.
Storage Gateway
A hybrid cloud storage service that merges on-premises resources with the cloud. It can be used for a one-time migration or for long-term hybrid storage.
DataSync
A one-time migration tool. An agent has to be installed on the source resources in order to migrate. The target can be S3, EFS, or FSx.