Kubernetes Autoscaling

Experiment (screenshots): initially 1 job running, then a few more jobs added, then node scale down.
Findings:
Job/Pod records are stored in etcd (control plane), which is managed by Google, so even if we delete all nodes, the records will still stay.
The records of a job/pod are deleted after the TTL has passed, so their lifetime depends on the TTL, not on the physical nodes that the jobs ran on (see the Job manifest sketch after this list).
However, the events will still persist for some time.
Deleting the node still results in logs being emitted, which can be seen in Logs Explorer. Even with the TTL set to 1, the job completion logs were still emitted. However, it is possible that there was a delay in log emission, so setting the TTL to a few minutes could potentially solve the problem.
Deleting a node manually from GKE does not actually delete the node. The underlying VM is still running, and this can be viewed in Compute Engine.
Deleting a node from a cluster that is supposed to autoscale can have nondeterministic results in GKE: it was found that upcoming jobs would not produce a TriggeredScaleUp event, causing the pod to wait in the Pending state for a long time.
Eventually the autoscaler will detect that the number of nodes has fallen below the minimum node count, but this can take a considerable amount of time.
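The TTL behavior in the findings above corresponds to the Job's ttlSecondsAfterFinished field. A minimal sketch of a Job manifest using it; the name, image, and command are hypothetical placeholders:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: example-job              # placeholder name
spec:
  ttlSecondsAfterFinished: 300   # keep the Job record ~5 minutes after it finishes
  template:
    spec:
      containers:
      - name: worker
        image: busybox
        command: ["sh", "-c", "echo done"]
      restartPolicy: Never
```

With a TTL of a few minutes like this, the Job object and its pods are garbage-collected well after completion, which leaves time for the completion logs to reach Logs Explorer, unlike the TTL-of-1 case observed above.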
high-priority

If you manually deleted the only node in your GKE cluster using kubectl delete node, and you observe that the node still appears in the Compute Engine section of GCP, there might be a discrepancy between the Kubernetes control plane's view of the cluster and the actual state of the Compute Engine resources.
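A quick way to confirm the discrepancy is to compare what the control plane reports with what Compute Engine reports. A sketch, assuming a hypothetical zone and project:

```sh
# Nodes as seen by the Kubernetes control plane
kubectl get nodes

# Underlying VMs as seen by Compute Engine (zone and project are placeholders)
gcloud compute instances list --zones us-central1-a --project my-project
```

If a VM appears in the second list but not in the first, the node object was deleted while the instance kept running.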

It seems that manual interference with node management can disrupt the normal operation of the autoscaler and lead to temporary issues.
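To see what minimum the autoscaler will eventually restore, the node pool's autoscaling bounds can be inspected; the cluster, pool, and zone names below are placeholders:

```sh
# Show the configured min/max node counts for the node pool
gcloud container node-pools describe default-pool \
  --cluster my-cluster --zone us-central1-a \
  --format="value(autoscaling.minNodeCount,autoscaling.maxNodeCount)"

# If the node count has fallen below the minimum, resizing the pool
# brings it back immediately instead of waiting for the autoscaler
gcloud container clusters resize my-cluster \
  --node-pool default-pool --num-nodes 1 --zone us-central1-a
```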

Deleting a Node Manually Is Not the Same as Scaling Down by the Autoscaler

1. Automation vs. Manual Intervention:
Autoscaler: Automatically scales down nodes based on resource utilization and predefined rules or metrics without human intervention.
Manual Deletion: Requires an operator to decide when to remove a node, which involves manual monitoring and decision-making.
2. Graceful Pod Eviction:
Autoscaler: Before scaling down, the autoscaler ensures that pods are safely and gracefully evicted from the node, respecting PodDisruptionBudgets and giving time for stateful applications to handle the termination properly.
Manual Deletion: While you can manually drain a node before deletion to mimic this behavior, it requires extra steps (see the drain sketch after this list).
3. Cluster State Awareness:
Autoscaler: Takes into account the overall state of the cluster, including resource demands, scheduling constraints, and health status, to make informed decisions about scaling down.
Manual Deletion: An operator might not have full visibility into the cluster state or might overlook certain aspects that the autoscaler would consider, potentially leading to suboptimal decisions.
4. Workload Disruption:
Autoscaler: Minimizes disruptions by scaling down when it determines that the remaining nodes can handle the existing workload without performance degradation.
Manual Deletion: If not done with a full understanding of the workload distribution and resource requirements, it can lead to overloading the remaining nodes, causing performance issues or outages.
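If a node does have to be removed by hand, the autoscaler's graceful eviction can be approximated with cordon and drain before deleting the node object; the node name below is a placeholder:

```sh
# Stop new pods from being scheduled on the node
kubectl cordon gke-my-cluster-default-pool-abc123

# Evict pods gracefully, respecting PodDisruptionBudgets
kubectl drain gke-my-cluster-default-pool-abc123 \
  --ignore-daemonsets --delete-emptydir-data

# Remove the node object; the Compute Engine VM still has to be
# removed separately (e.g. by resizing the node pool)
kubectl delete node gke-my-cluster-default-pool-abc123
```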