Can be run simultaneously
:plus:
:plus:
:plus:
:plus:
:plus:
The sense
The Golden Pattern we will compare every option against:
automatic health monitoring, health checks, and auto-repair; pay-per-pod; no need to calculate resource limits.
Managed control & data plane
Cloud Run is a managed serverless compute platform that helps you run highly scalable containerized applications that can be invoked via web requests or Pub/Sub events.
Built on the open Knative standard; Google manages the control and data plane:
autoscaling, load balancing, health checks, and auto-repair are all managed.
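As a sketch of how little configuration Cloud Run needs: a single command deploys a container and gets managed autoscaling, load balancing, and health checks. Service name, image path, and region below are placeholders.

```shell
# Deploy a container image to Cloud Run; the platform manages
# autoscaling, load balancing, and health checks for the service.
gcloud run deploy my-service \
  --image=gcr.io/my-project/my-app:latest \
  --region=europe-west1 \
  --allow-unauthenticated
```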
Google Kubernetes Engine (GKE) is a managed Kubernetes service that facilitates the orchestration of containers via declarative configuration and automation.
GKE runs Certified Kubernetes; the control plane is mostly managed by Google.
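To illustrate the difference in what Google manages, compare cluster creation in the two GKE modes (cluster names and locations are placeholders):

```shell
# Autopilot: Google provisions, scales, and repairs the nodes.
gcloud container clusters create-auto my-autopilot-cluster \
  --region=europe-west1

# Standard: node pools are explicit and managed by the user.
gcloud container clusters create my-standard-cluster \
  --zone=europe-west1-b \
  --num-nodes=3
```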
A platform to manage multiple clouds in one place.
Pricing
Pay per node
Pay per pod
Pay-per-use
Pay per node (VM)
Included as part of Anthos.
Eliminating the idle cost
:minus: (current Azure costs are for 95% idle)
:plus: :minus: Pod limits are set equal to requests
:plus: (possible to always allocate CPU and memory)
:minus:
:minus:
Scaling
:plus: Cluster is autoscaled
:info: Pre-configured: Autopilot handles all the scaling and configuring of your nodes.
Default:
To configure Horizontal pod autoscaling (HPA)
To configure Vertical Pod autoscaling (VPA)
:info: Automatic
:plus: Optional:
Node auto-provisioning
To configure cluster autoscaling.
HPA
VPA
:plus: Optional:
Node auto-provisioning
To configure cluster autoscaling.
HPA
VPA
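For reference, HPA can be configured with a single kubectl command on both Standard and Autopilot clusters (deployment name and thresholds below are illustrative):

```shell
# Horizontal Pod Autoscaling: scale the "web" deployment between
# 2 and 10 replicas, targeting 70% average CPU utilization.
kubectl autoscale deployment web --cpu-percent=70 --min=2 --max=10
```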
Secret Manager
:plus: :minus: Pay for use
:plus: :minus: Pay for use
:plus: Included
:plus: :minus: Pay for use
:plus: Included
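Usage is the same pay-for-use Secret Manager API across the options; a minimal sketch (secret name and value are placeholders):

```shell
# Create a secret and add a first version from stdin.
gcloud secrets create db-password --replication-policy=automatic
echo -n 's3cret' | gcloud secrets versions add db-password --data-file=-
```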
Cloud Monitoring, Cloud Logging, Cloud Trace, and Error Reporting
:plus: Monitoring and logging:
L2:
Dynatrace
Grafana, ServiceMonitors
Prometheus Operator
Splunk
L1:
Logging Layer - Fluentd
:plus: Pre-configured:
L1: System and workload logging
System monitoring
Optional:
L2: System and workload monitoring
:minus: :plus: Most external monitoring tools require access that is restricted. Solutions from several Google Cloud partners are available on Autopilot; however, not all are supported, and custom monitoring tools cannot be installed on Autopilot clusters.
:plus: Included, only GCP tools
:plus: Pay for use
Default:
System and workload logging
System monitoring
Optional: System-only logging
System and workload monitoring
:plus: Included, only GCP tools
Node configuration
upgrading, scaling, OS-related, SSH access, privileged pods
:plus:
IaC
kured
:minus: Not applicable: having no nodes to manage is the main feature
To troubleshoot Autopilot nodes, a user must contact Cloud Customer Care to obtain the member name required to access the cluster.
:minus:
:plus:
:plus:
Sidecar (Istio, etc…)
:plus:
:minus: Not yet, but experimental support is available: https://cloud.google.com/service-mesh/docs/unified-install/managed-asmcli-experimental
:minus: MutatingWebhookConfiguration
:minus: Linux “NET_ADMIN”
:minus:
:plus:
:plus:
Scale-to-zero container
:plus: Ability to install Knative
:plus: Ability to install Knative
:plus: Main feature!
:plus: Ability to install Knative
:plus:
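On Cloud Run, scale-to-zero is the default behavior; the minimum instance count can be raised if cold starts matter (service name and region are placeholders):

```shell
# min-instances=0 (the default) lets the service scale to zero when idle;
# a higher value keeps warm instances at extra cost.
gcloud run services update my-service \
  --min-instances=0 \
  --region=europe-west1
```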
Custom machine types
:minus:
:minus: Preemptible VMs
:minus: Limited CPU and Memory
:plus:
:plus: Standard or custom machine types on Anthos, including GPUs.
GPU, TPU
:minus: Not currently supported
:minus:
:plus:
:plus:
Nodes per cluster
400 (this quota can be lifted on request)
15,000 for GKE versions 1.18 and later.
Pods per node
Up to 32
:plus: DaemonSet Pods
only 1
Up to 110
Up to 110
Containers per cluster
300,000
Up to 1,000 container instances by default; can be increased via a quota increase.
300,000
300,000
Image type
Pre-configured: Container-Optimized OS with containerd
One of the following:
Container-Optimized OS with containerd
Container-Optimized OS with Docker
Ubuntu with containerd
Ubuntu with Docker
Windows Server LTSC
Windows Server SAC
Maximum memory size, in GB
Any
CPU in Autopilot is available in 0.25 vCPU increments (0.01 for DaemonSets), and the CPU:memory ratio must be between 1:1 and 1:6.5.
Autopilot sets limits equal to the given requests.
16Gi max per container instance
Files written to the local filesystem count toward available memory and may cause the container instance to run out of memory and crash.
Any
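A request that respects the Autopilot increments and ratio, as a sketch (deployment name and values are illustrative): 0.5 vCPU with 2 GiB of memory gives a 1:4 CPU:memory ratio, inside the allowed 1:1 to 1:6.5 range.

```shell
# Set requests in valid Autopilot increments; Autopilot will set
# limits equal to these requests.
kubectl set resources deployment web --requests=cpu=500m,memory=2Gi
```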
Storage
:plus:
:plus: only: "configMap", "csi", "downwardAPI", "emptyDir", "gcePersistentDisk", "hostPath", "nfs", "persistentVolumeClaim", "projected", "secret"
:minus: No storage volumes or persistent disks.
:plus: Only GCP services
:plus:
:minus: "configMap" is in beta
Networking
:plus:
N1: NGINX ingress controller with cert-manager and external-dns
N2: network security groups
:plus: Pre-configured:
VPC-native (alias IP)
Maximum 32 Pods per node
Intranode visibility
NodeLocalDNSCache
N1: HTTP load balancing
N2: Admission controllers for pod security policies
Default:
Public cluster
Default CIDR ranges
Network name/subnet
Optional:
Private cluster (a must for us)
N1: Cloud NAT
(private clusters only)
Authorized networks (a must for us)
Network policy (a must for us)
Access to VPC / Compute Engine network via Serverless VPC Access. Services cannot be part of the Istio service mesh.
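The private-cluster options marked as a must above can be combined at creation time; a sketch with placeholder names and CIDR ranges:

```shell
# Private GKE cluster with VPC-native networking and authorized networks.
gcloud container clusters create my-private-cluster \
  --enable-ip-alias \
  --enable-private-nodes \
  --master-ipv4-cidr=172.16.0.0/28 \
  --enable-master-authorized-networks \
  --master-authorized-networks=10.0.0.0/8
```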
Optional:
VPC-native (alias IP)
Maximum 110 Pods per node
Intranode visibility
CIDR ranges and max cluster size
Network name/subnet
Private cluster
Cloud NAT
Network policy
Authorized networks
Access to VPC / Compute Engine network. Services participate in the Anthos Service Mesh.
DNS service discovery
:plus: NodeLocalDNSCache preconfigured
:minus: must use the full (*.run.app) URL
:plus: possible with third-party runsd
:plus: Optional
:plus: Optional
Supported protocols (only)
:plus:
:plus: HTTP/1, HTTP/2, WebSockets, gRPC, Pub/Sub push events
:minus: GraphQL, HTTP/2 Server Push
:plus:
:plus: HTTP/1, HTTP/2, WebSockets, gRPC, Pub/Sub push events
:minus: GraphQL
Request timeout
up to 60 minutes
Latency
https://blog.yongweilun.me/gke-ingress-is-slower-than-you-think
https://cloud.google.com/blog/products/networking/using-netperf-and-ping-to-measure-network-latency
The average latency is:
< 0.21 s for 50% of requests
< 1 s for 95%
< 2 s for 99%
Security
:plus:
S1: open-policy-agent/gatekeeper for pod security policies
S2: PrismaCloud
Pre-configured:
Workload Identity
Shielded nodes
Secure boot
Workload Identity
Filestore CSI driver
S1: Admission controllers for pod security policies
Optional:
Customer-managed encryption keys (CMEK)
Application-layer secrets encryption
Google Groups for RBAC
:minus: NOT supported:
Binary authorization
Kubernetes Alpha APIs
Legacy authentication options
S2: Container Threat Detection
OPA Gatekeeper
Policy Controller
:plus: Managing access using IAM
Optional:
Workload Identity
Shielded nodes
Secure boot
Application-layer secrets encryption
Binary authorization
Customer-managed encryption keys (CMEK)
Google Groups for RBAC
Compute Engine service account
Workload Identity
:plus: Same as GKE
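Workload Identity, listed in both the pre-configured and optional columns above, is enabled on a Standard cluster like this (cluster and project names are placeholders):

```shell
# Bind Kubernetes service accounts to Google service accounts
# via the cluster's workload identity pool.
gcloud container clusters update my-cluster \
  --workload-pool=my-project.svc.id.goog
```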
URLs
letsencrypt
Custom domains only with manual SSL certificates.
Automatic service URLs and SSL certificates.
Custom domains only with manual SSL certificates.
Custom domains only with manual SSL certificates.
Upgrades, repair, and maintenance
Managed by us
Pre-configured:
Node auto-repair
Node auto-upgrade
Maintenance windows
Surge upgrades
Managed.
Optional:
Node auto-repair
Node auto-upgrade
Maintenance windows
Surge upgrades
Optional:
Node auto-repair
Node auto-upgrade
Maintenance windows
Surge upgrades
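Maintenance windows from the list above can be declared as a recurring schedule (times and recurrence rule below are placeholders):

```shell
# Weekly maintenance window on weekend nights (RFC 5545 recurrence rule).
gcloud container clusters update my-cluster \
  --maintenance-window-start=2024-01-06T02:00:00Z \
  --maintenance-window-end=2024-01-06T06:00:00Z \
  --maintenance-window-recurrence='FREQ=WEEKLY;BYDAY=SA,SU'
```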
Zero downtime
Managed by us
Managed by us
:plus: Managed traffic splitting between different revisions
Managed by us
:plus: Managed traffic splitting between different revisions
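On Cloud Run, managed traffic splitting between revisions looks like this (service and revision names are placeholders):

```shell
# Canary: send 10% of traffic to the new revision, 90% to the old one.
gcloud run services update-traffic my-service \
  --to-revisions=my-service-00002-abc=10,my-service-00001-xyz=90
```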
Container isolation
Default Kubernetes container isolation.
Kubernetes admission controllers and a seccomp profile are preconfigured.
Strict container isolation based on gVisor sandbox.
Default Kubernetes container isolation.
Default Kubernetes container isolation.
Create a Certificate Signing Request
:plus: cert-manager
:minus: Certificates are provided by Google; no support for third-party webhooks
https://cloud.google.com/kubernetes-engine/docs/how-to/managed-certs
:minus: Certificates are provided by Google
:plus:
:minus: Certificates are provided by Google
Execution environments
Fully managed on Google infrastructure.
GKE on Google Cloud
GKE on Anthos
Cluster add-ons
:plus:
CA1:
Service mesh:
Istio
linkerd
Pre-configured:
HTTP load balancing
Default:
Compute Engine persistent disk CSI Driver
NodeLocal DNSCache
:minus: NOT supported:
Calico network policy
Cloud Build
Cloud Run
Cloud TPU
Config Connector
Graphics processing units (GPUs)
CA1: Istio on Google Kubernetes Engine
Kalm
Usage metering
Read more:
https://cloud.google.com/files/shifting-left-on-security.pdf
https://cloud.google.com/security/infrastructure/design
Only GCP services
Optional:
Compute Engine persistent disk CSI Driver
HTTP load balancing
NodeLocal DNSCache
Cloud Build
Cloud Run
Cloud TPU
Config Connector
Istio on Google Kubernetes Engine
Kalm
Usage metering
Same as GKE
Backup
Velero
https://dev.azure.com/hmcloud/MCS-Build/_git/mcs-platform-velero
Backup for GKE (preview)
Backup for GKE (preview)
Backup for GKE (preview)
SLA
Kubernetes API and node availability
Service Level Objective of at least 99.95%
Kubernetes API availability
Kubernetes API availability