Horizontal pod autoscaling (HPA) scales out or in pods in deployment, replicaset, or stateful set based on CPU or memory metrics. In the example below, my-deployment will be scaled out up to 5 pods if the average CPU utilization exceeds 50%. The other way around works by scaling in pods down to 1, if the average CPU utilization hits below 50%.
kubectl autoscale deployment my-deployment --cpu-percent=50 --min=1 --max=5