AWS Certified Solutions Architect - Associate (SAA-C02) - A Cloud Guru

Exam guide

Response Types
Multiple choice - 1 correct response, 3 incorrect responses
Multiple response - 2 correct responses from 5 options
Exam domains
Design Resilient Architectures - 30%
Design a multi-tier architecture solution
Design highly available and/or fault-tolerant architectures
Design decoupling mechanisms using AWS services
Choose appropriate resilient storage
Design High-Performing Architectures - 28%
Identify elastic and scalable compute solutions for a workload
Select high-performing and scalable storage solutions for a workload
Select high-performing networking solutions for a workload
Choose high-performing database solutions for a workload
Design Secure Applications and Architectures - 24%
Design secure access to AWS resources
Design secure application tiers
Select appropriate data security options
Design Cost-Optimized Architectures - 18%
Identify cost-effective storage solutions
Identify cost-effective compute and database services
Design cost-optimized network architectures
Tip
The result is a score from 100 to 1,000, and the minimum passing score is 720. In total, there are 65 questions and 130 minutes to complete them.

Fundamentals

AWS consists of Regions and Availability Zones where:
Regions - Physical locations in the world, e.g., Frankfurt
Availability Zones (AZs) - Data centers, i.e., buildings filled with servers.
A Region consists of 2 or more Availability Zones, which are located far enough apart to count as separate AZs
Edge Locations - another type of location: endpoints for AWS that are used for caching content.
AWS offers a very large number of services, and the list keeps growing. However, to pass this exam, knowing only the following services is sufficient.

Main services of AWS

Compute
How to process the information
EC2, Lambda, Elastic Beanstalk
Storage
How to save information
S3, EBS, EFS, FSx, Storage Gateway
Databases
How to store and retrieve information
RDS, DynamoDB, Redshift
Networking
How Compute, Storage, and Databases communicate with each other
VPC, Direct Connect, Route 53, API Gateway, AWS Global Accelerator

5 Pillars of the Well-Architected Framework

Operational Excellence
Performance Efficiency
Security
Cost Optimization
Reliability

IAM

IAM = Identity and Access Management → manage users and their level of access to the AWS Console
Create users and grant permissions to those users
Create groups and roles
Control access to AWS resources

Root account

Email address which is used to sign up for AWS → full admin access
Therefore, this account must be secured by:
Enable multi-factor authentication on the root account
Create an admin group for your admins, and assign the appropriate permissions to this group
Create user accounts for your admins
Add your users to the admin group

Permission Control using IAM

Permissions are governed by Policy Documents in JSON format, which are assigned to Groups, Users, or Roles. IAM is global, so policies are independent of Regions
Example:
Usually, Policy Documents are not assigned directly to Users, as that is hard to manage. Instead, best practice is to create a group of users (even if the group contains only that one user) and assign the Policy Document to the group.
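As a rough illustration of what a Policy Document looks like, the sketch below builds one in Python and prints it as JSON. The bucket name `example-bucket`, the function name, and the choice of read-only S3 actions are assumptions made for this example only; they are not taken from the course.

```python
import json


def read_only_s3_policy(bucket_name: str) -> str:
    """Return a JSON policy document allowing read-only access to one bucket."""
    policy = {
        "Version": "2012-10-17",  # policy language version, not a date you choose
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",    # the bucket itself (for listing)
                    f"arn:aws:s3:::{bucket_name}/*",  # the objects inside it
                ],
            }
        ],
    }
    return json.dumps(policy, indent=2)


# "example-bucket" is a hypothetical name used only for illustration.
print(read_only_s3_policy("example-bucket"))
```

Following the note above, a document like this would normally be attached to a group rather than to an individual user.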

Building Blocks

Users - a physical person
Groups - functions → admin, dev, etc.
Roles - internal usage within AWS
Principle of least privilege - only assign a user the minimum amount of privileges they need to do their job

Tips

Access key ID and secret access keys are used for programmatic authentication
Username and passwords are used for console login authentication
[Access key ID and secret access key] and [Username and password] are not the same
The secret access key can be viewed only once, when it is created. If it is lost, the access key has to be regenerated

S3

Simple Storage Service (S3) - Object storage which is scalable and simple to use
Object storage → can store anything but cannot run OS or DB

Basics

S3 offers unlimited storage, but each object can be at most 5 TB
All objects are stored in folder-like containers called Buckets, whose names must be globally unique.
Object URL format:
https://{bucket-name}.s3.{region}.amazonaws.com/{key-name}
Example:
https://acloudguru.s3.us-east-1.amazonaws.com/Raphie.jpg
When an upload to an S3 bucket succeeds, an HTTP 200 response is returned.
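The URL format above can be sketched as a small helper. The function name `s3_object_url` is hypothetical; it only assembles the string and does not contact AWS.

```python
from urllib.parse import quote


def s3_object_url(bucket_name: str, region: str, key_name: str) -> str:
    """Build a virtual-hosted-style URL: https://{bucket}.s3.{region}.amazonaws.com/{key}."""
    # quote() percent-encodes characters that are unsafe in a URL path; "/" is kept.
    return f"https://{bucket_name}.s3.{region}.amazonaws.com/{quote(key_name)}"


print(s3_object_url("acloudguru", "us-east-1", "Raphie.jpg"))
# https://acloudguru.s3.us-east-1.amazonaws.com/Raphie.jpg
```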

S3 Object

Composed of:
Key - object name
Value - the data itself
Version ID - allows multiple versions of the object to be stored
Metadata - data about data (content-type, last-modified)
Access Control List (ACL) vs. Bucket Policy - An ACL governs access to an individual object, while the Bucket Policy controls access to all objects in the bucket. An ACL cannot override the Bucket Policy.

Versioning

Advantages:
All versions are stored in S3, even if the object is deleted
The object is effectively backed up
Once versioning is enabled, it cannot be disabled, only suspended
Can be integrated with lifecycle rules
Supports MFA Delete
Note that making an object public applies only to its latest version. Older versions require public access to be set individually
Deleting a versioned object does not remove it; it only adds a delete marker on top of the existing versions. Deleting the delete marker restores the object. To completely remove the file from the bucket, each of its versions (and the delete marker) must be deleted by version ID.
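The delete-marker behaviour can be easier to follow with a toy model. The sketch below is not the S3 API; `ToyVersionedBucket` is a hypothetical in-memory class that only imitates how a plain delete adds a marker, while deleting by version ID removes that version for good.

```python
import itertools


class ToyVersionedBucket:
    """In-memory imitation of a versioned bucket (illustration only, not S3)."""

    def __init__(self):
        self._versions = {}  # key -> list of (version_id, data); data None = delete marker
        self._ids = itertools.count(1)

    def put(self, key, data):
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, data))
        return version_id

    def delete(self, key, version_id=None):
        stack = self._versions.get(key, [])
        if version_id is None:
            # Plain delete: nothing is removed, a delete marker becomes the latest version.
            marker_id = f"v{next(self._ids)}"
            self._versions[key] = stack + [(marker_id, None)]
            return marker_id
        # Delete by version ID: that version (or marker) is removed permanently.
        self._versions[key] = [(v, d) for v, d in stack if v != version_id]
        return version_id

    def get(self, key):
        stack = self._versions.get(key, [])
        if not stack or stack[-1][1] is None:
            return None  # no versions, or the latest one is a delete marker
        return stack[-1][1]

    def list_versions(self, key):
        return [v for v, _ in self._versions.get(key, [])]


bucket = ToyVersionedBucket()
v1 = bucket.put("a.txt", b"one")
v2 = bucket.put("a.txt", b"two")
marker = bucket.delete("a.txt")      # object now looks deleted
print(bucket.get("a.txt"))           # None
bucket.delete("a.txt", marker)       # removing the marker restores the object
print(bucket.get("a.txt"))           # b'two'
for version in (v1, v2):             # removing every version deletes it for good
    bucket.delete("a.txt", version)
print(bucket.list_versions("a.txt"))  # []
```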

Storage classes

S3 Standard

High availability and Durability
Data is stored redundantly across multiple devices in multiple facilities (>=3 AZs)
Designed for Frequent Access
Perfect for frequently accessed data
Suitable for Most Workloads
Default storage class
For websites, content distribution, mobile and gaming apps, big data analytics

S3 Standard-Infrequent Access (Standard-IA)

Used for data that is accessed less frequently but requires rapid access when needed
Low per-GB storage price, but a per-GB retrieval fee applies
Great for long-term storage, backups, and as a data store for disaster recovery files

S3 One Zone-Infrequent Access

Like Standard-IA but costs 20% less
Data is stored redundantly within a single AZ

S3 Glacier

Long-term data archiving
Retrieval time ranges from 1 minute to 12 hours

S3 Glacier Deep Archive

For rarely accessed data
Default retrieval time is 12 hours

S3 Intelligent Tiering

For data with unknown access patterns
Automatically move the data to the most cost-effective tier based on how frequently it is accessed.
With lifecycle management, objects can be moved automatically between storage tiers to save cost. This also works with versioning: lifecycle rules can apply to both current and previous versions of a file, so old versions can be archived to a cheaper storage class.
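A lifecycle rule is essentially a small piece of configuration. The sketch below builds one as a plain Python dictionary whose shape is intended to resemble the lifecycle configuration that the AWS SDKs accept; the rule ID, the day thresholds, and the function name are all assumptions chosen for illustration, and nothing here is sent to AWS.

```python
def lifecycle_rule(prefix="", ia_days=30, glacier_days=90, noncurrent_glacier_days=30):
    """Build one lifecycle rule: current versions go to Standard-IA, then Glacier;
    previous (noncurrent) versions go straight to Glacier."""
    if glacier_days <= ia_days:
        raise ValueError("glacier_days must be greater than ia_days")
    return {
        "ID": "archive-old-data",  # hypothetical rule name
        "Status": "Enabled",
        "Filter": {"Prefix": prefix},  # empty prefix = whole bucket
        "Transitions": [
            {"Days": ia_days, "StorageClass": "STANDARD_IA"},
            {"Days": glacier_days, "StorageClass": "GLACIER"},
        ],
        "NoncurrentVersionTransitions": [
            {"NoncurrentDays": noncurrent_glacier_days, "StorageClass": "GLACIER"},
        ],
    }


config = {"Rules": [lifecycle_rule(prefix="logs/")]}
print(config["Rules"][0]["Transitions"])
```

The separate `NoncurrentVersionTransitions` entry is what lets old versions be archived on a different schedule from the current version.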

S3 Object Lock

Object Lock stores an object using a write once, read many (WORM) model, which prevents the object from being deleted or modified for a fixed amount of time (retention period) or indefinitely, adding an extra layer of protection.
Governance Mode
Objects and their versions cannot be overwritten or deleted, and their lock settings cannot be altered, unless the user has special permissions.
Compliance Mode
Objects and their versions cannot be overwritten or deleted, and their lock settings cannot be altered, regardless of the user’s permission level. Even the root user cannot touch objects in this mode until the retention period expires
Retention period vs Legal Holds
Both prevent an object from being overwritten or deleted, but:
Retention period - set as a duration. In Governance Mode, it can be changed by users with special permissions.
Legal holds - remain in effect until removed. Can be placed and removed by any user who has the s3:PutObjectLegalHold permission

S3 Glacier Vault Lock

S3 Glacier Vault Lock allows you to deploy and enforce compliance controls for individual S3 Glacier vaults with a vault lock policy
You can specify controls, like WORM, in a vault lock policy and lock the policy against future edits. Once locked, the policy can no longer be changed.

Encryption

Encryption in Transit - sending objects to/from bucket
SSL/TLS
HTTPS
Encryption at Rest: Server-Side Encryption (SSE) - encryption at S3
SSE-S3: Keys are managed by S3; users do not need to manage anything
SSE-KMS: Keys are managed by AWS Key Management Service
SSE-C: Keys are managed by customer
Encryption at Rest: Client-Side Encryption
The user encrypts files before uploading them to S3
Server-side encryption can be enforced by:
Selecting the encryption setting on the S3 bucket in the Console
Using a bucket policy
A bucket policy can make S3 reject any PUT request that uploads a file without the x-amz-server-side-encryption parameter in the request header
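A bucket policy of the kind described above could look roughly like the following sketch, which builds the document in Python. The bucket name, the statement ID, and the function name are illustrative assumptions; the `Null` condition is true when the `x-amz-server-side-encryption` header is absent from the request, so the `Deny` applies to those uploads.

```python
import json


def deny_unencrypted_puts_policy(bucket_name: str) -> dict:
    """Bucket policy that denies PutObject when the SSE header is missing."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnencryptedObjectUploads",  # hypothetical statement ID
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
                # "Null": "true" matches requests where the header is not present.
                "Condition": {"Null": {"s3:x-amz-server-side-encryption": "true"}},
            }
        ],
    }


# "example-bucket" is a placeholder name.
print(json.dumps(deny_unencrypted_puts_policy("example-bucket"), indent=2))
```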

Optimizing S3 Performance

An S3 object is addressed by a path made up of the bucket name, any prefixes, and the file name:
The components between the bucket name bucket_name and the file name file.txt are called prefixes. Request limits apply per prefix (3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second), so spreading objects across more prefixes gives higher overall request performance.
Upload → multipart upload increases upload speed and is recommended for files over 100 MB. It must be used for any file over 5 GB.
Download → S3 byte-range fetches increase download speed by fetching parts in parallel. They can also download only part of a file, e.g., just its header.
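Byte-range fetches rely on the standard HTTP `Range` header. As a minimal sketch, the helper below splits a file of a given size into `Range` header values that could be requested in parallel; the function name and the sizes used are assumptions for illustration, and no request is actually sent.

```python
def byte_ranges(total_size: int, chunk_size: int) -> list:
    """Split total_size bytes into HTTP Range header values of at most chunk_size bytes."""
    if total_size < 0 or chunk_size <= 0:
        raise ValueError("total_size must be >= 0 and chunk_size must be > 0")
    ranges = []
    start = 0
    while start < total_size:
        # Range ends are inclusive, so the last byte of a chunk is start + size - 1.
        end = min(start + chunk_size, total_size) - 1
        ranges.append(f"bytes={start}-{end}")
        start = end + 1
    return ranges


print(byte_ranges(10, 4))  # ['bytes=0-3', 'bytes=4-7', 'bytes=8-9']
```

Requesting only the first range is how a partial download, such as reading just a file header, would be expressed.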

S3 Replication

Replicates objects from a bucket in one region to a bucket in another region
Objects that already exist in the bucket are not replicated automatically when replication is turned on. Only objects uploaded afterwards, including new versions of existing objects, are replicated
Delete markers are not replicated by default.

EC2

Elastic Compute Cloud (EC2) - Secure, resizable compute cloud → Virtual Machine hosted in AWS

Pricing options

On-Demand - Pay by hour or second, depending on your need
Flexible - Low cost and flexibility without upfront payment
Short-term - Applications with short-term, spiky, or unpredictable workloads
Testing the water - Applications being tested on EC2 for the first time
Reserved - commit to a contract of 1 or 3 years to get a discount of up to 72% on the hourly charge
Predictable usage - applications with steady state or predictable usage
Specific capacity requirements - applications that require reserved capacity
Pay up Front - save more when paying upfront
Standard Reserved Instances - up to 72% off the on-demand price
Convertible Reserved Instances - up to 54% off the on-demand price, with the option to change to a different instance type of equal or greater value
Scheduled Reserved Instances - launch instances within a predefined time window.
Spot - purchase unused capacity at a discount of up to 90%. However, the price fluctuates with supply and demand
Flexible - applications that have flexible start and end times
Urgent capacity - users with an urgent need for large amounts of additional computing capacity
Cost sensitive - applications that are only feasible at very low compute prices
Dedicated - a physical EC2 server dedicated to you. The most expensive option. If a question involves licensing, go straight for the Dedicated option
Compliance - Regulatory requirements that may not support multi-tenant virtualization
On-Demand - can be purchased hourly on-demand
Licensing - great for licensing that does not support multi-tenancy or cloud deployments
Reserved - can be purchased as a reservation for up to 70% off the on-demand price
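To get a feel for the "up to" discounts listed above, the sketch below applies each maximum discount to an on-demand hourly price. The price of 0.10 per hour is a made-up number for illustration, not a real AWS rate, and actual discounts depend on instance type, term, and payment option.

```python
# Maximum discounts quoted in the notes above, as fractions of the on-demand price.
MAX_DISCOUNT = {
    "on_demand": 0.0,
    "standard_reserved": 0.72,
    "convertible_reserved": 0.54,
    "spot": 0.90,
}


def best_case_hourly_price(on_demand_price: float, option: str) -> float:
    """Hourly price if the maximum quoted discount for the option applied."""
    return round(on_demand_price * (1 - MAX_DISCOUNT[option]), 4)


HYPOTHETICAL_ON_DEMAND = 0.10  # assumed price per hour, for illustration only
for option in MAX_DISCOUNT:
    print(option, best_case_hourly_price(HYPOTHETICAL_ON_DEMAND, option))
```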

Roles

An identity that you can create in IAM that has specific permissions
Similar to a user → an AWS identity with permission policies that define what it can and cannot do. Unlike a user, a role is not tied to one person and can be assumed by anyone who needs it.
Roles can be assigned to users, AWS services, system-level accounts, or even used for cross-account access.

Security Groups

Computers usually communicate with each other using ports like:
SSH - port 22
RDP - port 3389
HTTP - port 80
HTTPS - port 443
Once an EC2 instance is created, a virtual firewall blocks all inbound traffic by default. To be able to connect to the instance, you need to open the correct port using Security Groups. A rule with source 0.0.0.0/0 lets everyone in on that port. In production, open only ports 80 and 443 to the world so that others cannot gain control of the instance via SSH or RDP.
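The production advice above can be expressed as a simple check. The sketch below is a hypothetical helper, not an AWS API: it models inbound rules as plain data and flags any rule that exposes SSH or RDP to 0.0.0.0/0.

```python
from dataclasses import dataclass

ADMIN_PORTS = {22: "SSH", 3389: "RDP"}
OPEN_TO_WORLD = "0.0.0.0/0"


@dataclass(frozen=True)
class IngressRule:
    """Simplified inbound rule: a port range and the source CIDR allowed to use it."""
    from_port: int
    to_port: int
    cidr: str


def risky_rules(rules):
    """Return a message for every admin port that is reachable from anywhere."""
    findings = []
    for rule in rules:
        if rule.cidr != OPEN_TO_WORLD:
            continue
        for port, name in ADMIN_PORTS.items():
            if rule.from_port <= port <= rule.to_port:
                findings.append(f"{name} (port {port}) is open to {OPEN_TO_WORLD}")
    return findings


rules = [
    IngressRule(80, 80, OPEN_TO_WORLD),       # HTTP for everyone: fine
    IngressRule(443, 443, OPEN_TO_WORLD),     # HTTPS for everyone: fine
    IngressRule(22, 22, OPEN_TO_WORLD),       # SSH for everyone: flagged
    IngressRule(3389, 3389, "10.0.0.0/16"),   # RDP from a private range only
]
print(risky_rules(rules))  # ['SSH (port 22) is open to 0.0.0.0/0']
```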

Bootstrap script

A script that runs once, when an instance first starts. Usually it is a shell script containing installation, update, and configuration commands

Metadata

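Metadata = data about data → IP address, hostname, security groups, etc.
Retrieve metadata from within the EC2 instance at http://169.254.169.254/latest/meta-data/
Retrieve user data (e.g., the bootstrap script) from within the EC2 instance at http://169.254.169.254/latest/user-data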
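Instance metadata and user data are served over HTTP at the link-local address 169.254.169.254, reachable only from inside the instance. The sketch below builds those URLs and includes a simple fetch helper. The function names are hypothetical, and the plain GET shown here assumes the instance still permits unauthenticated (IMDSv1-style) requests; instances configured to require session tokens would reject it.

```python
from urllib.request import urlopen

METADATA_BASE = "http://169.254.169.254/latest"


def metadata_url(path: str = "") -> str:
    """URL for a metadata item such as 'local-ipv4' or 'hostname'."""
    return f"{METADATA_BASE}/meta-data/{path.lstrip('/')}"


def user_data_url() -> str:
    """URL that returns the user data (e.g., the bootstrap script)."""
    return f"{METADATA_BASE}/user-data"


def fetch(url: str, timeout: float = 2.0) -> str:
    """Plain GET; only works when run on an EC2 instance that allows it."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


print(metadata_url("local-ipv4"))
print(user_data_url())
# On an instance, one might then call: fetch(metadata_url("local-ipv4"))
```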

Virtual Networking in EC2

ENI (Elastic Network Interface) - basic day-to-day networking
create a management network
use network and security appliances in your VPC
private home network subnet
EN (Enhanced Networking) - uses single root I/O virtualization (SR-IOV) → high performance (10 Gbps - 100 Gbps)
higher bandwidth, higher packets per second (PPS), lower inter-instance latencies
Provided through either the Elastic Network Adapter (ENA) or the Intel 82599 Virtual Function (VF) interface - in any scenario question, always choose ENA over the VF interface
EFA (Elastic Fabric Adapter) - accelerates high performance computing (HPC) and ML applications
lower and more consistent latency and higher throughput than TCP transport
when a question asks which network interface to use for high performance computing, go straight to EFA
it uses OS-bypass to let HPC and ML applications run faster with lower latency. When a question mentions OS-bypass, go straight to EFA

Placement groups

Cluster Placement Groups - a group of instances within a single AZ → for applications that need low network latency, high network throughput, or both
Spread Placement Groups - a group of instances that are each placed on separate hardware → for applications that need to keep instances on different hardware, e.g., for security, or appliance reasons
Partition Placement Groups - instances are divided into partitions, each on its own racks with its own network and power source → for isolation to reduce the impact of hardware failure

Spot Instances

EC2 lets users run on unused capacity in the cloud at a discount of up to 90% compared to the on-demand price.

When to use?

Applications using Spot Instances must be stateless, fault-tolerant, and flexible. For example:
big data
containerized workloads (CI/CD)
high-performance computing
test and dev workloads
Spot instances are not good for:
persistent workloads
critical jobs
databases