Merge pull request #5180 from platformsh/5178-autoscaling-improvements

ralt · web-flow · commit d87eb693dce8 · 2025-11-14T14:27:19.000+01:00
5178 autoscaling improvements
diff --git a/sites/upsun/src/manage-resources/autoscaling.md b/sites/upsun/src/manage-resources/autoscaling.md
@@ -1,22 +1,44 @@
 ---
 title: Autoscaling
-description: Learn how autoscaling dynamically adjusts app instances based on CPU usage to keep apps responsive under load while optimizing costs.
+description: Learn how autoscaling adjusts app instances based on CPU and memory usage to keep apps stable and cost-efficient under varying workloads.
 weight: -100
 keywords:
   - "resources"
   - "CPU"
   - "autoscaling"
   - "scaling"
 ---
+<!-- vale off -->
+Autoscaling automatically adjusts how many instances of your application are running, adding capacity when demand rises and removing it when demand drops. It helps your app stay responsive under heavy load while keeping your infrastructure costs efficient.
 
-Autoscaling allows your applications to automatically [scale horizontally](/manage-resources/adjust-resources.html#horizontal-scaling) based on resource usage. 
+## What is autoscaling?
 
-This ensures your apps remain responsive under load while helping you optimize costs.  
+Autoscaling works through [horizontal scaling](/manage-resources/adjust-resources.html#horizontal-scaling): it adds or removes whole application instances depending on resource usage. If CPU or [memory](#memory-based-autoscaling) utilization stays above the scale-up threshold for a set time, {{% vendor/name %}} automatically adds more instances. If utilization stays below the scale-down threshold, {{% vendor/name %}} removes unneeded ones. You control these thresholds and limits, so scaling stays within the bounds you define.
 
 - **Scope:** Available for applications only  
 - **Product tiers:** Available for all Upsun Flex environments  
 - **Environments:** Configurable per environment - across development, staging, and production
 
+{{< note theme="info" title="Know your app first">}}
+Autoscaling is quick to set up: you can [enable it in a few clicks](#enable-autoscaling) from your environment’s **Configure resources** tab. However, it’s important to understand your app’s typical performance before turning it on. 
+
+Tools like [Blackfire](https://www.blackfire.io/) can help you identify where your app consumes CPU or memory, so you can set realistic thresholds that reflect your traffic patterns. Blackfire can also help you spot whether autoscaling is likely to benefit your app or if a fixed setup with tuned [vertical resources](/manage-resources/adjust-resources.html#vertical-scaling) like CPU/RAM would serve you better.
+{{< /note >}}
+
+## When to use autoscaling
+
+Autoscaling makes the most sense for workloads with variable or unpredictable traffic. It’s especially valuable when:
+
+- You run time-sensitive or customer-facing applications where latency matters.
+- Your app experiences seasonal or campaign-driven spikes.
+- You want to avoid paying for idle capacity during quieter periods.
+
+### Example: When autoscaling works effectively
+A retail app sees traffic jump fivefold every Friday evening and during holiday campaigns. By enabling autoscaling, the app automatically adds instances when CPU usage rises and scales back overnight, ensuring smooth checkouts without wasted cost.
+
+### Example: When autoscaling might not be needed
+An internal dashboard with predictable, low usage may not benefit from autoscaling. In this case, a fixed number of instances and tuned vertical resources (CPU/RAM) can be more cost-effective and stable.
+
 {{< note theme="info" title="Scale databases and resources">}}
 
 To vertically scale CPU, RAM, or disk, or horizontally scale applications and workers (manually), see:  
@@ -59,21 +81,30 @@ The tables below outline where autoscaling and manual scaling are supported, so
 | Trigger                   | Console     | 
 | ------------------------- | ----------- | 
 | Average CPU (min/max)     | Available   | 
-| Average Memory (min/max)  | Coming      | 
+| Average Memory (min/max)  | Available   | 
 
-  
 ## How autoscaling works
 
 ### Thresholds
 
-Autoscaling continuously monitors the average CPU utilization across your app's running instances. It works by you setting your thresholds, which are specific CPU usage levels that determine when autoscaling should take action. There are two different thresholds that your CPU utilization operates within: A scale-up threshold and a scale-down threshold.
+Autoscaling monitors the average CPU and [memory usage](#memory-based-autoscaling) of your running app instances.  
+You define thresholds that determine when new instances are launched or removed.
+
+Each trigger (CPU or memory utilization) operates between two thresholds: a scale-up threshold and a scale-down threshold.
 
 - **Scale-up threshold**: If your chosen trigger (e.g. CPU usage) stays **above** this level for the time period you've set (the evaluation period), autoscaling will launch additional instances to share the load.
 
 - **Scale-down threshold**: If your chosen trigger stays **below** this level for the time period you've set, autoscaling will remove unneeded instances to save resources and costs.
 
 To prevent unnecessary back-and-forth, autoscaling also uses a cooldown window: a short waiting period before another scaling action can be triggered. This can also be configured or kept to the [default](#default-settings) waiting period before any additional scaling starts. 
 
+{{< note theme="warning" title="Combined triggers" >}}
+
+If both CPU and memory triggers are enabled, either one can initiate scaling. A global cooldown applies after each scaling event, but in rare cases, combined triggers may interact unexpectedly: for example, a CPU-triggered scale-up followed by a memory-triggered scale-down. Adjust thresholds and cooldowns carefully to avoid oscillation.
+
+{{< /note >}}
+
+
 ### Default settings
 
 Autoscaling continuously monitors the configured **trigger** across your app’s running instances. We will use the **average CPU utilization** trigger as the primary example for the default settings and examples below.
@@ -103,6 +134,46 @@ Autoscaling continuously monitors the configured **trigger** across your app’s
 
 This cycle ensures your app automatically scales up during high demand and scales down when demand drops, helping balance performance with cost efficiency.
 
+## Memory-based autoscaling
+
+Autoscaling primarily relies on CPU utilization as its trigger. You can also configure memory-based autoscaling, which works in a similar way but has a few important differences.
+
+### CPU-based triggers
+
+CPU-based autoscaling reacts to sustained changes in average CPU utilization.
+
+- Scale-up threshold: When average CPU usage stays above your defined limit for the evaluation period, instances are added to distribute the load.
+- Scale-down threshold: When CPU usage remains below your lower limit for the evaluation period, instances are removed to save resources.
+- Cooldown window: A delay (default: 5 minutes) before another scaling action can occur.
+
+### Memory-based triggers
+
+Memory-based autoscaling follows the same principle as CPU triggers but measures average memory utilization instead. When your app consistently uses more memory than your upper threshold, {{% vendor/name %}} adds instances; when memory usage remains low, it removes them.
+
+This option is useful for workloads where caching or in-memory data handling determines performance - for example, large data processing apps or services with persistent caching layers.
+
+#### Example
+
+| Condition | Scaling action |
+|------------|----------------|
+| Memory above 80% for 5 minutes | Scale up: Add one instance |
+| Memory below 30% for 5 minutes | Scale down: Remove one instance |
+
+{{< note theme="warning" title="Understand your app’s memory profile" >}}
+High memory usage doesn’t always mean your app needs more instances. Linux systems use available memory for caching and buffering, so 90–100% usage can be normal even under stable conditions. Before using memory-based autoscaling, profile your application’s typical memory behavior to avoid unnecessary scaling and extra cost.
+
+Tools such as [Blackfire](https://www.blackfire.io/) or system-level metrics in your [Application metrics dashboard](/increase-observability/application-metrics.html) can help you understand what “normal” looks like for your app.
+{{< /note >}}
+
+#### Configure memory triggers
+1. Open your project in the Console.  
+2. Select your target environment.  
+3. Choose **Configure resources**.  
+4. Under **Autoscaling**, select **Enable** (if not already enabled).  
+5. Choose **Memory usage (min/max)** as your scaling trigger.  
+6. Set scale-up and scale-down thresholds, evaluation period, and cooldown window.  
+7. Save your changes. Your app now scales automatically based on memory utilization.
+
 ## Guardrails and evaluation
 
 Autoscaling gives you control over the minimum and maximum number of instances your app can run. These guardrails ensure your app never scales up or down too far. Set boundaries to keep scaling safe, predictable, and cost-efficient:
@@ -234,3 +305,4 @@ Scaling down to zero instances is also **not supported**. Use minimum instance c
 - [Payment FAQ](/administration/billing/payment-faq.html) 
 - [Monitor billing](/administration/billing/monitor-billing.html) 
 - [Pricing overview](https://www.upsun.com/pricing/)
+<!-- vale on -->