Navigating the EC2 Instance Maze

Choosing the correct EC2 instance can be a confusing experience

Picture this: You're ready to launch an EC2 instance, and suddenly, you're faced with an alphabet soup of options - t2.micro, t3a.xlarge, m6a.4xlarge, c6i.2xlarge, x2iezn.metal. With AWS offering well over 750 instance types, choosing the right one can feel overwhelming. Don't worry - this guide will help you navigate through the maze of AWS instance types and make informed decisions for your workload.

The Foundation: Understanding Nitro

Before diving into instance types, let's discuss something fundamental: AWS Nitro. Nitro is AWS's underlying virtualization platform that powers modern EC2 instances. It's a collection of purpose-built hardware and software components that provide:

  • Enhanced security through hardware-based isolation
  • Improved performance with dedicated hardware acceleration
  • Better networking capabilities
  • More consistent performance across instance types

Pro Tip: As a general rule, prefer instance types built on Nitro. Nitro is used for all new generations and offers better performance and security features. Previous-generation instances lack support for integrations such as Network Load Balancers and other common features, and the performance difference can be noticeable.

Instance Naming Conventions


AWS instance names might look like cryptic codes, but they follow a somewhat logical pattern. Let's break down an instance name like "c7gn.xlarge":

  1. Instance Family (c): Indicates the use case (e.g., compute optimised)
  2. Generation Number (7): Higher numbers mean newer hardware
  3. Additional Capabilities (gn): Processor type and/or special features
  4. Size (xlarge): Determines the share of the underlying physical host's resources allocated to the instance
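
The breakdown above can be sketched as a small parser. This is a minimal illustration rather than an official AWS grammar: the regular expression and the `parse_instance_name` helper are assumptions made for this post, and some names (for example the high-memory `u-` types) do not fit the pattern and are rejected.

```python
import re
from typing import NamedTuple


class InstanceName(NamedTuple):
    family: str        # e.g. "c"
    generation: int    # e.g. 7
    capabilities: str  # e.g. "gn" (may be empty, may end in "-flex")
    size: str          # e.g. "xlarge"


# Family letters, generation digits, optional capability letters
# (optionally followed by a "-flex" style suffix), a dot, then the size.
_PATTERN = re.compile(r"^([a-z]+)(\d+)([a-z]*(?:-[a-z]+)?)\.([a-z0-9-]+)$")


def parse_instance_name(name: str) -> InstanceName:
    """Split an instance type name into its four parts (hypothetical helper)."""
    match = _PATTERN.match(name.lower())
    if match is None:
        raise ValueError(f"unrecognised instance type name: {name!r}")
    family, generation, capabilities, size = match.groups()
    return InstanceName(family, int(generation), capabilities, size)


print(parse_instance_name("c7gn.xlarge"))
print(parse_instance_name("m7i-flex.large"))
```

Treating the name as data like this makes it easier to filter or group instance types in scripts, but the parsed parts are only as reliable as the naming convention itself.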

Instance Families

AWS groups instance families into categories, each built on hardware targeted at a specific kind of workload.

  • General Purpose (m, t)
    • Balanced compute, memory, and networking resources. Best for applications that use these resources in equal proportions (web servers, development environments, small databases, code repositories)
  • Compute Optimised (c)
    • Optimized for compute-intensive workloads. Ideal for batch processing
  • Memory Optimised (r, u, u-1, x, z)
    • For workloads that process large datasets in memory. Perfect for databases, distributed web scale cache stores, real-time big data analytics
  • Storage Optimised (i, im, is, d, h)
    • High sequential read and write access to large datasets on local storage. Built for data warehousing, log processing, distributed file systems
  • Accelerated Computing (p, g, trn, inf, dl, f, vt)
    • Hardware accelerators or co-processors for graphics and data pattern matching. Designed for machine learning, video encoding, 3D visualizations
  • High-Performance Computing (hpc)
    • Optimized for high-performance computing workloads requiring high levels of inter-instance communication and network performance. Used for complex scientific simulations, financial risk modeling, weather prediction

Generation

The generation number of an instance type refers to the hardware generation. Generations are sometimes consistent across different families, such as general-purpose, compute-optimised, and memory-optimised. Specialised instance types typically increment on their own timelines.

For the latest generations of general-purpose, compute-optimised and memory-optimised instances, a given combination of generation and CPU type maps to the same underlying physical CPU. This also holds for additional capabilities - you can think of these as the same hardware with additional Nitro cards to support things like additional networking (n) or onboard storage (d).

  • m6a, c6a, r6a = AMD EPYC 7R13 Processor
  • m6i, c6i, r6i = Intel Xeon 8375C (Ice Lake)
  • m6g, m6gd = AWS Graviton 2 Processor
  • m8g, c8g, r8g = AWS Graviton 4 Processor

As a result, when moving across the same generation of instances, you can be reasonably confident you will get very similar CPU performance from two comparable instances. For example, using benchmarking tools such as PassMark in CPU-bound tests, you will get similar results for r8g.4xlarge, c8g.4xlarge and m8g.4xlarge instances. You can observe this pattern on published benchmark websites, such as RunsOn.

AWS EC2 Instances Benchmark
Compare CPU speed, pricing, and spot interruption percentages of most EC2 instance types.

This makes capacity planning and instance type selection slightly more straightforward. However, this rule does not apply to older instances; for example, the c5 uses the Intel Xeon Platinum 8124M, whereas the m5 uses the Intel Xeon Platinum 8175M. As with all parts of EC2 instance selection, there are exceptions.

Processor Types and Architecture Indicators

For new generations, the letters after the generation number tell you essential information about the processor, for example:

  • a: AMD processors (e.g., c7a = AMD EPYC 9R14)
  • i: Intel processors (e.g., c7i = Intel Xeon Sapphire Rapids)
  • g: AWS Graviton processors (e.g., m7g = AWS Graviton 3)

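This convention can break down for specialised instance types whose names carry no CPU distinction, such as the g6 or p5 instances. In these cases the CPU vendor varies by family (the g6 and p5, for example, use AMD EPYC processors, while some older accelerated families use Intel), so check the instance specification.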

Additional Capabilities

Further letters highlight additional capabilities of specific instance types:

  • z: High-frequency CPU boost (e.g. all-core turbo up to 4.5 GHz)
  • b: Block storage optimised - comes with additional EBS networking capacity
  • e: High memory-to-CPU ratio
  • n: Enhanced networking capacity
  • d: Instance storage (local SSDs)
  • flex: Burstable CPU
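
The two lists of letters above can be combined into a simple decoder. This is a sketch under stated assumptions: the lookup tables only contain the letters listed in this post, `describe_suffix` is a hypothetical helper, and it expects only the letters that follow the generation number (for example "gn" from c7gn), not a full instance name.

```python
# Letters taken from the lists in this post; anything else is reported as unknown.
PROCESSORS = {
    "a": "AMD processor",
    "i": "Intel processor",
    "g": "AWS Graviton processor",
}
CAPABILITIES = {
    "z": "high-frequency CPU boost",
    "b": "block storage optimised",
    "e": "high memory-to-CPU ratio",
    "n": "enhanced networking capacity",
    "d": "instance storage (local SSDs)",
}


def describe_suffix(suffix: str) -> list[str]:
    """Describe the letters that follow the generation number (hypothetical helper)."""
    letters, _, tail = suffix.lower().partition("-")
    notes = []
    for letter in letters:
        if letter in PROCESSORS:
            notes.append(PROCESSORS[letter])
        elif letter in CAPABILITIES:
            notes.append(CAPABILITIES[letter])
        else:
            notes.append(f"unknown letter {letter!r}")
    if tail == "flex":
        notes.append("burstable CPU (flex)")
    elif tail:
        notes.append(f"unknown suffix {tail!r}")
    return notes


print(describe_suffix("gn"))      # letters from c7gn
print(describe_suffix("i-flex"))  # letters from m7i-flex
```

Unknown letters are reported rather than guessed, since the convention has exceptions, particularly on older generations.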

Instance Size

The sizing approach is relatively consistent across instance types. Instance sizes affect CPU and memory allocation, but they have other impacts, as discussed below.

AWS uses the term vCPU to describe an allocation of the host's CPU. For fixed-performance instance types (not burstable), the whole CPU core is typically dedicated to the launched instance.

The term vCPU comes with the following rules (and exceptions!):

  • For Intel/AMD, an instance is assigned a minimum of 1 CPU core (to avoid specific CPU vulnerabilities such as side-channel attacks). This core is typically multithreaded using Simultaneous Multi-Threading (SMT), so 2 vCPUs correspond to one underlying multithreaded CPU core. This rule currently has exceptions for M7a, R7a, C7a instances, T2 instances, and m3.medium [1].
  • For Graviton/ARM, each vCPU is a dedicated single-threaded core.

The labelling for sizes follows this table for "vCPUs".

Size Category | vCPU Allocation | Note
medium        | 1               | Supported by instances with no hyper-threading (ARM, c7a, m7a, r7a, t2, etc.)
large         | 2               |
xlarge        | 4               |
2xlarge       | 8               |
4xlarge       | 16              |
8xlarge       | 32              |
12xlarge      | 48              |
16xlarge      | 64              |
24xlarge      | 96              |
32xlarge      | 128             |
48xlarge      | 192             |
metal         | Host Capacity   |

It follows that these instance types all have 32 vCPUs:

  • m6a.8xlarge
  • c6i.8xlarge
  • r6g.8xlarge

The AMD and Intel instance types provide 32 vCPUs over 16 physical cores (unless you are working with the exceptions listed above), and the Graviton instance type provides 32 vCPUs across 32 physical cores.
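
The arithmetic above can be sketched as follows. The size table and the list of exceptions come from this post; the helper names are hypothetical, and detecting Graviton by looking for a "g" after the generation number is a simplifying assumption. The "metal" size is deliberately left out because it maps to whole-host capacity.

```python
import re

# vCPU count per size, as listed in the table above.
SIZE_VCPUS = {
    "medium": 1, "large": 2, "xlarge": 4, "2xlarge": 8, "4xlarge": 16,
    "8xlarge": 32, "12xlarge": 48, "16xlarge": 64, "24xlarge": 96,
    "32xlarge": 128, "48xlarge": 192,
}

# Intel/AMD families named in this post as exceptions to the 2-threads-per-core rule.
NO_SMT_EXCEPTIONS = {"c7a", "m7a", "r7a", "t2"}


def threads_per_core(type_prefix: str) -> int:
    """Return 1 for Graviton and the listed exceptions, otherwise 2 (assumption)."""
    match = re.fullmatch(r"[a-z]+\d+([a-z-]*)", type_prefix)
    suffix = match.group(1) if match else ""
    if "g" in suffix or type_prefix in NO_SMT_EXCEPTIONS:
        return 1
    return 2


def vcpus_and_cores(instance_type: str) -> tuple[int, int]:
    """Return (vCPUs, estimated physical cores) for a name like 'm6a.8xlarge'."""
    type_prefix, size = instance_type.lower().split(".")
    vcpus = SIZE_VCPUS[size]
    # An instance always receives at least one physical core.
    cores = max(1, vcpus // threads_per_core(type_prefix))
    return vcpus, cores


for name in ("m6a.8xlarge", "c6i.8xlarge", "r6g.8xlarge"):
    print(name, vcpus_and_cores(name))
```

For anything that matters, treat the result as an estimate and confirm the core count against the published specification for the instance type.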

The hidden details that will catch you out

You've launched a c7i.2xlarge, great, but wait - writing to EBS becomes slow after some time - why is this?

There are further aspects of the hardware which are tied to instance size; these are:

  • Networking Performance
  • EBS Baseline, Maximum Throughput and I/O Operations/second
  • Allocatable GPUs or other accelerated Computing modules
  • NVMe Storage Capacity

Looking at the c7i instance type, we can see that EBS throughput is limited on smaller instance sizes.

This information is available in the AWS documentation. At the time of writing, Vantage.sh has a bug that causes it to show this EBS throughput information incorrectly.

Type            | Networking (Gbps) | EBS MB/s Baseline | EBS MB/s Burst
c7i.large       | Up to 12.5        | 81.25             | 1250.0
c7i.xlarge      | Up to 12.5        | 156.25            | 1250.0
c7i.2xlarge     | Up to 12.5        | 312.5             | 1250.0
c7i.4xlarge     | Up to 12.5        | 625.0             | 1250.0
c7i.8xlarge     | 12.5              | 1250.0            | 1250.0
c7i.12xlarge    | 18.75             | 1875.0            | 1875.0
c7i.16xlarge    | 25                | 2500.0            | 2500.0
c7i.24xlarge    | 37.5              | 3750.0            | 3750.0
c7i.metal-24xl  | 37.5              | 3750.0            | 3750.0
c7i.48xlarge    | 50                | 5000.0            | 5000.0
c7i.metal-48xl  | 50                | 5000.0            | 5000.0

Important: Check the full details of the instance type you are launching to understand the limitations! It's not just CPU + Memory!
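
One way to check these details programmatically is the EC2 `DescribeInstanceTypes` API. The sketch below assumes boto3 is installed and AWS credentials are configured; `summarise_instance_type` and `fetch_summaries` are hypothetical helpers and the default region is just a placeholder. The field names follow the response shape of `describe_instance_types` as I understand it, so verify them against the boto3 documentation before relying on them.

```python
def summarise_instance_type(info: dict) -> dict:
    """Pick out the limits discussed above from one DescribeInstanceTypes entry."""
    ebs = info.get("EbsInfo", {}).get("EbsOptimizedInfo", {})
    return {
        "type": info["InstanceType"],
        "vcpus": info["VCpuInfo"]["DefaultVCpus"],
        "memory_gib": info["MemoryInfo"]["SizeInMiB"] / 1024,
        "network": info["NetworkInfo"]["NetworkPerformance"],
        "ebs_baseline_mbps": ebs.get("BaselineThroughputInMBps"),
        "ebs_maximum_mbps": ebs.get("MaximumThroughputInMBps"),
    }


def fetch_summaries(instance_types: list[str], region: str = "eu-west-1") -> list[dict]:
    """Query EC2 for the given types (needs boto3 and credentials)."""
    import boto3  # imported lazily so the helper above works without it

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.describe_instance_types(InstanceTypes=instance_types)
    return [summarise_instance_type(item) for item in response["InstanceTypes"]]


# Hand-written sample shaped like an API entry, using the c7i.2xlarge row above.
sample = {
    "InstanceType": "c7i.2xlarge",
    "VCpuInfo": {"DefaultVCpus": 8},
    "MemoryInfo": {"SizeInMiB": 16384},
    "NetworkInfo": {"NetworkPerformance": "Up to 12.5 Gigabit"},
    "EbsInfo": {"EbsOptimizedInfo": {
        "BaselineThroughputInMBps": 312.5,
        "MaximumThroughputInMBps": 1250.0,
    }},
}
print(summarise_instance_type(sample))
```

Calling `fetch_summaries(["c7i.2xlarge", "c7i.8xlarge"])` would return one summary per type, which makes it easy to compare baseline and maximum EBS throughput side by side.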

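Sizes from .large to .4xlarge can reach the burst figure only for a limited period, after which they are throttled to the baseline for EBS throughput, networking throughput, and IOPS. This is why writes on the c7i.2xlarge slow down after some time, and it is worth bearing in mind if you have a persistent high-throughput workload.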
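
For a sustained workload, the practical question is which size has a baseline that covers your throughput. A minimal sketch using the c7i figures from the table above (the data is copied from this post, not fetched live, and `smallest_size_for` is a hypothetical helper; the metal sizes are omitted):

```python
from typing import Optional

# (size, EBS baseline MB/s, EBS burst MB/s) for c7i, smallest first.
C7I_EBS_MBPS = [
    ("large", 81.25, 1250.0),
    ("xlarge", 156.25, 1250.0),
    ("2xlarge", 312.5, 1250.0),
    ("4xlarge", 625.0, 1250.0),
    ("8xlarge", 1250.0, 1250.0),
    ("12xlarge", 1875.0, 1875.0),
    ("16xlarge", 2500.0, 2500.0),
    ("24xlarge", 3750.0, 3750.0),
    ("48xlarge", 5000.0, 5000.0),
]


def smallest_size_for(sustained_mbps: float) -> Optional[str]:
    """Return the smallest c7i size whose EBS baseline covers the sustained rate."""
    for size, baseline, _burst in C7I_EBS_MBPS:
        if baseline >= sustained_mbps:
            return size
    return None  # no single c7i instance sustains this rate


print(smallest_size_for(500))   # a 500 MB/s sustained workload
print(smallest_size_for(6000))  # beyond any c7i size
```

The check is made against the baseline column rather than the burst column because only the baseline is available indefinitely.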

So, how do I decide on an instance type?

With this knowledge, some key questions come to mind:

  • Does the workload require dedicated CPU allocation?
    • No - Use latest-generation burstable (t3, t3a, t4g, m7i-flex) to save costs
    • Yes - Continue
  • Does the workload have any special requirements?
    • Extra low latency instance store local SSD => Storage Optimised
    • GPU / Inference / FPGA => Accelerated Computing Instances
    • HPC workloads (200Gbps interconnects) => HPC Instances
  • Is a specific balance of CPU and Memory required?
    • Compute? 1 vCPU : 2 GiB => Compute Optimised
    • Balanced? 1 vCPU : 4 GiB => General Purpose
    • Memory? 1 vCPU : 8 GiB => Memory Optimised
  • Are there any requirements for EBS throughput and Network Throughput?
    • This could determine the minimum Instance Size selectable!
      • Check the full specifications of the instance
    • If the size is just right, but a specific capability is required, there may be an additional capability instance type:
      • c6i.large = Up to 12.5Gbps
      • c6in.large = Up to 25Gbps (Network enhanced)
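
The questions above can be sketched as a small decision helper. The categories and example families come from this post; the function name, the `special` keywords, and the cut-off points between the 2, 4 and 8 GiB-per-vCPU ratios are assumptions made for illustration.

```python
from typing import Optional

# Hypothetical keywords for the special requirements listed above.
SPECIAL_REQUIREMENTS = {
    "local_ssd": "Storage Optimised",
    "accelerator": "Accelerated Computing",
    "hpc": "HPC",
}


def suggest_category(
    needs_dedicated_cpu: bool,
    special: Optional[str] = None,
    memory_gib_per_vcpu: float = 4.0,
) -> str:
    """Walk the decision questions in order and return a suggested category."""
    if not needs_dedicated_cpu:
        return "Burstable (t3, t3a, t4g, m7i-flex)"
    if special is not None:
        return SPECIAL_REQUIREMENTS[special]
    # Assumed cut-offs: up to 2 GiB per vCPU is compute, up to 4 is general.
    if memory_gib_per_vcpu <= 2:
        return "Compute Optimised"
    if memory_gib_per_vcpu <= 4:
        return "General Purpose"
    return "Memory Optimised"


print(suggest_category(needs_dedicated_cpu=False))
print(suggest_category(True, special="accelerator"))
print(suggest_category(True, memory_gib_per_vcpu=8))
```

This only narrows down the category; the EBS and network throughput questions still determine the minimum size and whether a variant such as the network-enhanced c6in is needed.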

Vantage.sh has a great tool for reviewing instance specs; see the c6a.large instance type below.

c6a.large pricing and specs - Vantage
The c6a.large instance is in the compute optimized family with 2 vCPUs, 4.0 GiB of memory and up to 12.5 Gibps of bandwidth starting at $0.0765 per hour.

Final Thoughts: Don't Over-Optimise Early

Remember: EC2 instances aren't permanent decisions. Start with a reasonable choice, monitor your application's performance, and adjust as needed. The beauty of cloud computing is its flexibility.

Start Smart:

  • Go with recent generations (m6a or higher is a solid starting point)
  • If cost is your primary concern, try Graviton instances (e.g. r8g)
  • Benchmark your workload between Intel, AMD and Graviton to see what works best for you

Once you've gathered real-world performance data, you can optimise further and commit to Savings Plans or Reserved Instances to reduce costs.

Check out tools like Vantage.sh for detailed instance specifications and pricing comparisons. They provide valuable insights into instance capabilities and can help you make more informed decisions.

Interested in how AWS instances translate to real-world metal? Check out my other post here!

Understanding AWS Instances
As of October 2023, AWS (Amazon Web Services) boast over 700 instance types to choose from, the largest of any cloud provider; starting in August 2006 with the m1.small instance types with 1vCPU and 1.7GiB of RAM, technology has come a long way to the latest generation 7