What is autoscaling?

Autoscaling is a technique used in cloud computing to allocate resources. With autoscaling, the number of active servers or virtual machines varies automatically according to demand.

Autoscaling enables organizations to automate schedule-based and load-based scaling by managing resources on demand.  

Why is autoscaling important?

Autoscaling offers several advantages to organizations of all sizes because it eliminates the need for manually adding or reducing instances to accommodate fluctuating demand. It enables organizations to achieve reliable performance at lower cost by automating tasks such as increasing or decreasing computing or memory resources and managing traffic spikes.

Before autoscaling, an organization’s CPU, memory, and network capacity were fixed. They could not expand to meet higher demand, and over-provisioned resources sat unused when demand was low. Autoscaling saves electricity and resources by allowing servers to go inactive during periods of low load, which is useful for companies running their own web server infrastructure.

Autoscaling also lowers cloud costs because most cloud providers use a pay-as-you-go pricing model. It allows you to prioritize workloads, assigning less sensitive workloads to machines that are available during low-traffic periods. Because autoscaling offers flexible resource allocation, companies with variable workloads can achieve more consistent uptime and availability.

Finally, some autoscaling tools also keep the environment constant by detecting and replacing unhealthy instances.

What is autoscaling used for?

Autoscaling is a core component of today’s cloud deployments. You can offload processing to a new server automatically, according to conditions set by IT administrators. You can autoscale several components, such as central processing units (CPU), memory, or bandwidth.

This technology is also used to ensure service availability. For instance, an e-commerce site may provision the resources it assumes will be enough to handle normal traffic. But if there is a surge in traffic, such as on Black Friday, those resources may not be enough, and the system may crash. Autoscaling accounts for those cases, ensuring the site remains available to meet customer demand.

What are the main autoscale features?

Common features included in autoscaling solutions are:

  • Unified scaling:  You can configure automatic scaling for all scalable components from a single interface. 
  • Automatic resource discovery: This feature scans the environment and detects scalable cloud resources without the need to do so manually. 
  • Predictive scaling: Predicts future spikes in traffic and provisions the right resources accordingly.  
  • Schedule-based scaling: You can provision in advance, assigning the required machines on set dates and times. Add, edit, select, or delete schedules from the interface.
  • Load-based scaling: This feature allows you to define at what load you want the system to scale up or down.
  • Force log off: You can force lingering or inactive sessions to log off to achieve more cost savings.
  • Dynamic session timeouts: This feature allows machines to have different timeouts at different times of the day, for example, shorter or longer timeouts during peak hours.
  • Cost visualization and monitoring: From the autoscale console, you can monitor cost and consumption metrics, seeing in real time how much the optimization is saving you. 
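
As a rough illustration of how schedule-based and load-based scaling can work together, the sketch below takes the larger of a scheduled minimum and a load-derived count. The schedule entries, the `SESSIONS_PER_MACHINE` capacity, and the function names are assumptions made for this example, not settings from any particular product.

```python
from datetime import datetime, time
from math import ceil

# Hypothetical schedule: (start, end, minimum machines to keep powered on).
SCHEDULE = [
    (time(8, 0), time(18, 0), 10),   # business hours
    (time(18, 0), time(22, 0), 4),   # evening
]
OFF_PEAK_MINIMUM = 1          # assumed floor outside scheduled windows
SESSIONS_PER_MACHINE = 20     # assumed capacity of a single machine


def scheduled_minimum(now: datetime) -> int:
    """Return the minimum machine count the schedule requires at `now`."""
    for start, end, minimum in SCHEDULE:
        if start <= now.time() < end:
            return minimum
    return OFF_PEAK_MINIMUM


def desired_machines(now: datetime, active_sessions: int) -> int:
    """Never drop below the schedule; add machines when load requires it."""
    load_based = ceil(active_sessions / SESSIONS_PER_MACHINE)
    return max(scheduled_minimum(now), load_based)
```

With these assumed values, 50 sessions at 09:00 are covered by the scheduled minimum of 10 machines, while 400 sessions at the same hour push the load-based count above the schedule.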

What is the difference between autoscaling and load balancing?

Elastic load balancing and autoscaling are often confused because they are similar technologies. Both manage the traffic load among servers, assigning resources according to need. While many solutions include autoscaling features in their load balancers, they are used for different applications.

Autoscaling enables you to define the criteria by which the system will manage the number of instances and server resources for on-peak and off-peak traffic. An elastic load balancer, on the other hand, distributes the traffic, directing the data requests according to the health of the instance.

The solutions that combine both technologies allow you to define the autoscaling policies that will direct how the load balancer distributes the load among instances.
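
The distinction can be sketched in a few lines: one function decides where requests go, the other decides how many instances exist. This is a simplified model under assumed names (`Instance`, `route_requests`, `scale_to`, and the `web-N` naming are all hypothetical), and round-robin is only one of several possible distribution strategies.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import List


@dataclass
class Instance:
    name: str
    healthy: bool = True


def route_requests(instances: List[Instance], request_count: int) -> List[str]:
    """Load balancing: spread requests round-robin over healthy instances."""
    healthy = [inst for inst in instances if inst.healthy]
    if not healthy:
        raise RuntimeError("no healthy instances available")
    targets = cycle(healthy)
    return [next(targets).name for _ in range(request_count)]


def scale_to(instances: List[Instance], desired_count: int) -> List[Instance]:
    """Autoscaling: change how many instances exist, not where traffic goes."""
    if desired_count < 0:
        raise ValueError("desired_count must not be negative")
    fleet = list(instances)
    while len(fleet) < desired_count:
        fleet.append(Instance(name=f"web-{len(fleet) + 1}"))
    return fleet[:desired_count]
```

In a combined solution, the autoscaler would call something like `scale_to` when a policy fires, and the load balancer would pick up the new instances on its next routing pass.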

Benefits of autoscaling

There are several benefits of implementing an autoscaling solution compared to having a static instance that you need to scale manually:

  • Better performance: Defining autoscaling policies enables admins to set their performance level goals. Autoscaling tools then track and maintain performance according to policy. 
  • Fault tolerance: Human error, application faults, or hardware problems can take a service down. Autoscaling tools monitor the health of the system, replace faulty instances, and assign resources as needed.
  • Increased efficiency: By automating the scaling process, you optimize resource assignments and increase efficiency. 
  • Cost savings: Autoscaling prevents the waste of resources caused by over-provisioning.  
  • Improved availability: By removing unhealthy instances and allocating resources according to need, autoscaling ensures a consistent provision of resources, which prevents networks from getting overwhelmed by sudden spikes in requests.
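
The fault-tolerance behavior described above, replacing unhealthy instances so that capacity stays constant, might look like the following minimal sketch. The `is_healthy` and `launch` callables are placeholders for whatever health check and provisioning mechanism a real tool provides.

```python
from typing import Callable, List


def replace_unhealthy(
    instances: List[str],
    is_healthy: Callable[[str], bool],
    launch: Callable[[], str],
) -> List[str]:
    """Keep healthy instances and launch one replacement per unhealthy one."""
    kept = [name for name in instances if is_healthy(name)]
    missing = len(instances) - len(kept)
    # The fleet size stays the same: each removed instance gets a replacement.
    return kept + [launch() for _ in range(missing)]
```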

Types of autoscaling

There are three main types of autoscaling:

  • Reactive: This approach consists of scaling resources up and down according to spikes or lulls in traffic. As such, it requires monitoring resources in real time. 
  • Scheduled: Users can plan for times when more resources will be needed, such as peak seasons for an e-commerce site. With this approach, you can provision the required resources ahead of time.
  • Predictive:  This approach involves leveraging machine learning and artificial intelligence techniques to analyze traffic loads and predict when there may be an increase or decrease in demand.
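
To make the reactive approach concrete, here is a minimal sketch of a proportional scaling rule: the instance count is adjusted so that the observed metric moves back toward a target. The 60 percent CPU target and the minimum and maximum bounds are assumed values chosen for illustration.

```python
from math import ceil


def reactive_desired(
    current_instances: int,
    observed_cpu_percent: int,
    target_cpu_percent: int = 60,   # assumed utilization target
    minimum: int = 1,               # assumed lower bound
    maximum: int = 20,              # assumed upper bound
) -> int:
    """Scale the instance count in proportion to observed vs. target load."""
    if current_instances < 1:
        raise ValueError("current_instances must be at least 1")
    if target_cpu_percent <= 0:
        raise ValueError("target_cpu_percent must be positive")
    raw = ceil(current_instances * observed_cpu_percent / target_cpu_percent)
    # Clamp so a metric spike or lull cannot scale beyond the allowed range.
    return max(minimum, min(maximum, raw))
```

A scheduled or predictive policy could feed the same function by raising `minimum` ahead of an expected peak instead of waiting for the metric to rise.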

There are also two main ways of autoscaling: horizontally and vertically.

  • Horizontal scaling (scaling out) adds or removes instances to match demand. It is the common approach in cloud-based solutions, where it saves the cost of adding new physical servers and makes it possible to adjust capacity ad hoc.
  • Vertical scaling (scaling up) adds more capacity, such as CPU or memory, to an existing server. Typically done in infrastructure-heavy enterprises, this approach is more expensive and is limited by the capacity of the server the provider can offer.

It’s important to note that the method of scaling can vary according to the components you want to scale; databases may need a different approach than bandwidth, for example.

Autoscaling use cases

There are several applications for autoscaling technology, but it is most suitable for applications with variability in usage and demand. Autoscaling also simplifies cloud deployments, as it adjusts the resources as needed. It is especially helpful in hybrid environments because it enables seamless bursting to the cloud.

You can also use autoscaling to automate the response of different groups of resources to different levels of demand. By setting policies and requirements in advance, you take the burden of allocating resources away from administrators. Once properly configured, the system will manage the resources according to your preferences.

Which workloads are supported by Citrix Autoscale?

Citrix Autoscale is a feature exclusive to Citrix desktops. It optimizes the capacity of hybrid environments, enabling you to prioritize resource utilization. For instance, from the cloud console, you can set primary and secondary zones so the system will use the on-premises resources and expand to the cloud as secondary resources are required.

Autoscale provides the flexibility to scale up or down according to the method that suits you best. You can execute load-based scaling, schedule-based scaling, or combine both. Additionally, you can identify which machines are in drain state, thus prioritizing other machines that are close to full use.

Citrix solutions for autoscaling

Autoscale is a feature of Citrix DaaS (formerly Citrix Virtual Apps and Desktops service) that provides a way to manage your virtual machines across hybrid environments. As a versatile management tool, it allows you to reduce operational expenses across different cloud use cases. You can save on cloud costs by delivering resources only when they are needed, while still providing a good user experience by preventing downtime and availability problems.

Citrix’s solution combines the advantages of autoscaling with load balancing algorithms. By enabling vertical load balancing, it assigns workloads first to the machines running closest to capacity, thereby maximizing capacity usage.
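
The placement idea behind vertical load balancing can be illustrated with a short sketch: among machines that still have room, choose the busiest one, so that lightly used machines can drain and be powered off. This is a simplified model and not the Citrix implementation; the function names, the session-count metric, and the `max_sessions` limit are assumptions. A least-loaded picker is included only for contrast.

```python
from typing import Dict, Optional


def pick_busiest_with_room(
    session_counts: Dict[str, int], max_sessions: int
) -> Optional[str]:
    """Vertical-style placement: fill the fullest machine that has capacity."""
    candidates = {m: n for m, n in session_counts.items() if n < max_sessions}
    if not candidates:
        return None  # every machine is full; a scale-out would be needed
    return max(candidates, key=candidates.get)


def pick_least_loaded(
    session_counts: Dict[str, int], max_sessions: int
) -> Optional[str]:
    """Contrast case: spread sessions by choosing the emptiest machine."""
    candidates = {m: n for m, n in session_counts.items() if n < max_sessions}
    if not candidates:
        return None
    return min(candidates, key=candidates.get)
```

Filling machines one at a time concentrates sessions on fewer machines, which is what allows idle machines to be shut down and capacity usage to stay high.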

Autoscale with Citrix DaaS helps your organization make full use of its cloud capacity while enhancing the user experience and improving efficiency and availability. Learn more about Autoscale with Citrix DaaS by contacting a Citrix representative.
