Navigating the EC2 Instance Maze

Choosing the correct EC2 instance can be a confusing experience

Picture this: You're ready to launch an EC2 instance, and suddenly, you're faced with an alphabet soup of options - t2.micro, t3a.xlarge, m6a.4xlarge, c6i.2xlarge, x2iezn.metal. With AWS offering well over 750 instance types, choosing the right one can feel overwhelming. Don't worry - this guide will help you navigate through the maze of AWS instance types and make informed decisions for your workload.

The Foundation: Understanding Nitro

Before diving into instance types, let's discuss something fundamental: AWS Nitro. Nitro is AWS's underlying virtualization platform that powers modern EC2 instances. It's a collection of purpose-built hardware and software components that provide:

Enhanced security through hardware-based isolation
Improved performance with dedicated hardware acceleration
Better networking capabilities
More consistent performance across instance types

Pro Tip: As a general rule, prefer instance types that support Nitro. Nitro is used for all new generations and offers better performance and security features. Previous-generation instances lack support for integrations such as Network Load Balancers and other common features, and the performance difference can be noticeable.

Instance Naming Conventions

AWS instance names might look like cryptic codes, but they follow a somewhat logical pattern. Let's break down an instance name like "c7gn.xlarge":

Instance Family (c): Indicates the use case (e.g., compute optimised)
Generation Number (7): Higher numbers mean newer hardware
Additional Capabilities (gn): Processor type and/or special features
Size (xlarge): Determines the share of the underlying physical host's resources allocated to the instance
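To make the pattern concrete, here is a minimal Python sketch that splits a name into the four parts listed above. The `InstanceName` type, the `parse_instance_name` function and the regular expression are illustrative assumptions, not an official AWS grammar; names that do not fit the pattern (for example the high-memory `u-` types) are rejected rather than guessed at.

```python
import re
from typing import NamedTuple


class InstanceName(NamedTuple):
    family: str        # e.g. "c" (compute optimised)
    generation: int    # e.g. 7
    capabilities: str  # e.g. "gn" (processor type and/or special features)
    size: str          # e.g. "xlarge"


# Illustrative pattern: family letters, generation digits,
# optional capability letters (may include "-flex"), a dot, then the size.
_PATTERN = re.compile(r"([a-z]+)(\d+)([a-z-]*)\.([a-z0-9-]+)")


def parse_instance_name(name: str) -> InstanceName:
    """Split an instance type name such as "c7gn.xlarge" into its parts."""
    match = _PATTERN.fullmatch(name.lower())
    if match is None:
        raise ValueError(f"unrecognised instance type name: {name!r}")
    family, generation, capabilities, size = match.groups()
    return InstanceName(family, int(generation), capabilities, size)


print(parse_instance_name("c7gn.xlarge"))
```

Calling it on "x2iezn.metal" or "m7i-flex.large" splits those names the same way, which is a quick way to check your reading of an unfamiliar type.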

Instance Families

AWS groups instance types into families, each built on hardware targeted at specific use cases.

General Purpose (m, t)
- Balanced compute, memory, and networking resources. Best for applications that use these resources in equal proportions (web servers, development environments, small databases, code repositories)
Compute Optimised (c)
- Optimized for compute-intensive workloads. Ideal for batch processing
Memory Optimised (r, u, u-1, x, z)
- For workloads that process large datasets in memory. Perfect for databases, distributed web scale cache stores, real-time big data analytics
Storage Optimised (i, im, is, d, h)
- High, sequential read/write access to large datasets on local storage. Built for data warehousing, log processing, distributed file systems
Accelerated Computing (p, g, trn, inf, dl, f, vt)
- Hardware accelerators or co-processors for graphics and data pattern matching. Designed for machine learning, video encoding, 3D visualizations
High-Performance Computing (hpc)
- Optimized for high-performance computing workloads requiring high levels of inter-instance communication and network performance. Used for complex scientific simulations, financial risk modeling, weather prediction

Generation

The generation number of an instance type refers to the hardware generation. These can sometimes be consistent across different types, such as general, compute-optimised, and memory-optimised. Specialised instance types typically increment on their own timelines.

For the latest generations of general-purpose, compute-optimised and memory-optimised instances, the combination of generation and CPU type maps to the same underlying physical CPU. The same holds for additional capabilities: you can think of these variants as the same hardware with extra Nitro cards to support things like additional networking (n) or onboard storage (d).

m6a, c6a, r6a = AMD EPYC 7R13 Processor
m6i, c6i, r6i = Intel Xeon 8375C (Ice Lake)
m6g, m6gd = AWS Graviton 2 Processor
m8g, c8g, r8g = AWS Graviton 4 Processor

As a result, when moving across the same generation of instances, you can be reasonably confident you will get very similar CPU performance from two comparable instances. For example, using benchmarking tools such as PassMark in CPU-bound tests, you will get similar results for r8g.4xlarge, c8g.4xlarge and m8g.4xlarge instances. You can observe this pattern on published benchmark websites, such as RunsOn.

This makes capacity planning and instance type selection slightly more straightforward to understand. However, it should be noted this rule does not apply to older instances; for example, the c5 uses Intel Xeon Platinum 8124M vs the m5's Intel Xeon Platinum 8175. As with all parts of EC2 instance selection, there are exceptions.

Processor Types and Architecture Indicators

For new generations, the letters after the generation number tell you essential information about the processor, for example:

a: AMD processors (e.g., c7a = AMD EPYC 9R14)
i: Intel processors (e.g., c7i = Intel Xeon Sapphire Rapids)
g: AWS Graviton processors (e.g., m7g = AWS Graviton 3)

This convention can break down for specialised instance types which don't come with CPU distinctions, such as the g6 or p5 instances. In these cases, instances are typically provisioned with Intel CPUs.

Additional Capabilities

Further letters highlight additional capabilities of specific instance types:

z: High-frequency CPU boost (e.g. all-core Turbo up to 4.5 GHz)
b: Block storage optimised - comes with additional EBS networking capacity
e: High memory-to-CPU ratio
n: Enhanced networking capacity
d: Instance storage (local SSDs)
flex: Burstable CPU
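As a rough illustration, the letters can be decoded with a small lookup table. This Python sketch only covers the letters listed in this post; the `ATTRIBUTE_MEANINGS` table and the `describe_attributes` function are hypothetical helpers, and AWS may use letters that are not included here.

```python
# Letter meanings as listed in this post; not an exhaustive AWS reference.
ATTRIBUTE_MEANINGS = {
    "a": "AMD processor",
    "i": "Intel processor",
    "g": "AWS Graviton processor",
    "z": "High-frequency CPU",
    "b": "Block storage (EBS) optimised",
    "e": "High memory-to-CPU ratio",
    "n": "Enhanced networking",
    "d": "Local instance storage (SSD)",
}


def describe_attributes(attributes: str) -> list[str]:
    """Translate a suffix such as "gn" or "i-flex" into readable labels."""
    # Anything after a hyphen is treated as a named variant (e.g. "flex").
    letters, _, variant = attributes.partition("-")
    labels = [
        ATTRIBUTE_MEANINGS.get(letter, f"Unknown attribute '{letter}'")
        for letter in letters
    ]
    if variant == "flex":
        labels.append("Burstable CPU (flex)")
    elif variant:
        labels.append(f"Unknown variant '{variant}'")
    return labels


print(describe_attributes("gn"))
```

Unknown letters are reported rather than silently dropped, since the convention has exceptions.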

Instance Size

The sizing approach is relatively consistent across instance types. Instance sizes affect CPU and memory allocation, but they have other impacts, as discussed below.

AWS uses the term vCPU to describe an allocation of the host's CPU. For fixed-performance instance types (not burstable), whole CPU cores are typically dedicated to the launched instance.

The term vCPU comes with the following rules (and exceptions!):

For Intel/AMD, an instance is assigned a minimum of one whole CPU core (to avoid specific CPU vulnerabilities such as side-channel attacks). This core is typically multithreaded using Simultaneous Multi-Threading (SMT), so 2 vCPUs map to one underlying multithreaded CPU core. This rule currently has exceptions for M7a, R7a, C7a instances, T2 instances, and m3.medium [1].
For Graviton/ARM, each vCPU is a dedicated single-threaded core.

The labelling for sizes follows this table for "vCPUs".

Size Category	vCPU Allocation	Note
medium	1	Available on instances without hyperthreading (ARM, c7a, m7a, r7a, t2, etc.)
large	2
xlarge	4
2xlarge	8
4xlarge	16
8xlarge	32
12xlarge	48
16xlarge	64
24xlarge	96
32xlarge	128
48xlarge	192
metal	Host Capacity

It follows that these instance types all have 32 vCPUs:

m6a.8xlarge
c6i.8xlarge
r6g.8xlarge

The AMD and Intel instance types will have 32 vCPUs across 16 physical cores (unless you are working with the exceptions listed above), and the Graviton instance type will have 32 vCPUs across 32 physical cores.
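The arithmetic can be sketched in a few lines of Python. The `VCPUS_BY_SIZE` mapping is copied from the size table above, and `physical_cores` is a hypothetical helper; the caller supplies the threads per core (2 for SMT-enabled Intel/AMD types, 1 for Graviton and the exceptions noted earlier), because the size label alone does not carry that information.

```python
# vCPU counts per size, copied from the size table in this post.
# "metal" is omitted because it depends on the capacity of the host.
VCPUS_BY_SIZE = {
    "medium": 1, "large": 2, "xlarge": 4, "2xlarge": 8, "4xlarge": 16,
    "8xlarge": 32, "12xlarge": 48, "16xlarge": 64, "24xlarge": 96,
    "32xlarge": 128, "48xlarge": 192,
}


def physical_cores(size: str, threads_per_core: int) -> int:
    """Estimate physical cores from the size label and threads per core."""
    vcpus = VCPUS_BY_SIZE[size]
    # An instance always gets at least one whole core.
    return max(1, vcpus // threads_per_core)


print(physical_cores("8xlarge", threads_per_core=2))  # m6a.8xlarge, c6i.8xlarge
print(physical_cores("8xlarge", threads_per_core=1))  # r6g.8xlarge
```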

The hidden details that will catch you out

You've launched a c7i.2xlarge, great, but wait - writing to EBS becomes slow after some time - why is this?

There are further aspects of the hardware which are tied to instance size; these are:

Networking Performance
EBS Baseline, Maximum Throughput and I/O Operations/second
Allocatable GPUs or other accelerated computing modules
NVMe Storage Capacity

Looking at the c7i instance type, we can see that EBS throughput is limited on smaller instance sizes.

Such information is available in AWS Documentation. There is currently a bug in Vantage.sh which shows this EBS throughput information incorrectly.

Type	Networking (Gbps)	EBS MB/s Baseline	EBS MB/s Burst
c7i.large	Up to 12.5	81.25	1250.0
c7i.xlarge	Up to 12.5	156.25	1250.0
c7i.2xlarge	Up to 12.5	312.5	1250.0
c7i.4xlarge	Up to 12.5	625.0	1250.0
c7i.8xlarge	12.5	1250.0	1250.0
c7i.12xlarge	18.75	1875.0	1875.0
c7i.16xlarge	25	2500.0	2500.0
c7i.24xlarge	37.5	3750.0	3750.0
c7i.metal-24xlarge	37.5	3750.0	3750.0
c7i.48xlarge	50	5000.0	5000.0
c7i.metal-48xl	50	5000.0	5000.0

Important: Check the full details of the instance type you are launching to understand the limitations! It's not just CPU + Memory!

Sizes from .large to .4xlarge can only burst to the maximum figures; under sustained load they are typically throttled back to the baseline for EBS throughput, IOPS, and networking throughput. This is worth bearing in mind if you have a persistent high-throughput workload.
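As a small worked example, the Python sketch below uses the c7i figures from the table above to check whether a sustained EBS throughput target fits within a size's baseline. The `C7I_EBS_THROUGHPUT` mapping and both function names are illustrative; always confirm current figures in the AWS documentation before relying on them.

```python
from typing import Optional

# (baseline MB/s, burst MB/s) per c7i size, copied from the table in this post.
C7I_EBS_THROUGHPUT = {
    "large": (81.25, 1250.0),
    "xlarge": (156.25, 1250.0),
    "2xlarge": (312.5, 1250.0),
    "4xlarge": (625.0, 1250.0),
    "8xlarge": (1250.0, 1250.0),
    "12xlarge": (1875.0, 1875.0),
    "16xlarge": (2500.0, 2500.0),
    "24xlarge": (3750.0, 3750.0),
    "48xlarge": (5000.0, 5000.0),
}


def relies_on_burst(size: str, sustained_mb_s: float) -> bool:
    """True when the target exceeds the baseline but is within the burst limit."""
    baseline, burst = C7I_EBS_THROUGHPUT[size]
    return baseline < sustained_mb_s <= burst


def smallest_size_for(sustained_mb_s: float) -> Optional[str]:
    """Return the smallest c7i size whose baseline covers the sustained rate."""
    # The dict is ordered from smallest to largest size.
    for size, (baseline, _burst) in C7I_EBS_THROUGHPUT.items():
        if baseline >= sustained_mb_s:
            return size
    return None


print(relies_on_burst("2xlarge", 400.0))
print(smallest_size_for(400.0))
```

With a sustained target of 400 MB/s, a c7i.2xlarge only gets there by bursting, which matches the slow-writes scenario described earlier; the first size whose baseline covers it is the 4xlarge.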

So, how do I decide on an instance type?

With this knowledge, some key questions come to mind:

Does the workload require dedicated CPU allocation?
- No - Use latest-generation burstable (t3, t3a, t4g, m7i-flex) to save costs
- Yes - Continue
Does the workload have any special requirements?
- Extra low latency instance store local SSD => Storage Optimised
- GPU / Inference / FPGA => Accelerated Computing Instances
- HPC workloads (200Gbps interconnects) => HPC Instances
Is a specific balance of CPU and Memory required?
Are there any requirements for EBS throughput and Network Throughput?
- This could determine the minimum Instance Size selectable!
  - Check the full specifications of the instance
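The first three questions can be sketched as a small Python function. This is only an illustration of the flow above: the `suggest_category` name, the requirement keys, and especially the GiB-per-vCPU thresholds (roughly 2, 4 and 8 GiB per vCPU for the c, m and r families) are assumptions added here as a rule of thumb, not part of the checklist itself.

```python
from typing import Optional

# Special requirements from the checklist, keyed by hypothetical labels.
SPECIAL_CATEGORIES = {
    "local_ssd": "Storage Optimised",
    "accelerator": "Accelerated Computing",
    "hpc": "HPC",
}


def suggest_category(needs_dedicated_cpu: bool,
                     special_requirement: Optional[str] = None,
                     gib_per_vcpu: float = 4.0) -> str:
    """Walk the checklist and return a starting category, not a final answer."""
    if not needs_dedicated_cpu:
        return "Burstable (t3, t3a, t4g, m7i-flex)"
    if special_requirement is not None:
        if special_requirement not in SPECIAL_CATEGORIES:
            raise ValueError(f"unknown requirement: {special_requirement!r}")
        return SPECIAL_CATEGORIES[special_requirement]
    # Assumed rule of thumb: memory per vCPU decides between c, m and r.
    if gib_per_vcpu <= 2:
        return "Compute Optimised (c)"
    if gib_per_vcpu <= 4:
        return "General Purpose (m)"
    return "Memory Optimised (r)"


print(suggest_category(True, gib_per_vcpu=8))
```

The result is only a family to start from; the EBS and network throughput question still has to be answered against the full specifications of the candidate size.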

If the size is just right, but a specific capability is required, there may be an additional capability instance type:
- c6i.large = Up to 12.5Gbps
- c6in.large = Up to 25Gbps (Network enhanced)
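If you prefer to compare variants programmatically, the EC2 `DescribeInstanceTypes` API returns these figures. The Python sketch below assumes boto3 is installed and that credentials and a region are already configured; `summarise_specs` and `fetch_specs` are hypothetical helper names, and only a handful of response fields are extracted.

```python
def summarise_specs(instance_type_info: dict) -> dict:
    """Pick the headline figures out of one DescribeInstanceTypes entry."""
    # EbsOptimizedInfo may be absent on some types, so read it defensively.
    ebs = instance_type_info.get("EbsInfo", {}).get("EbsOptimizedInfo", {})
    return {
        "type": instance_type_info["InstanceType"],
        "vcpus": instance_type_info["VCpuInfo"]["DefaultVCpus"],
        "memory_gib": instance_type_info["MemoryInfo"]["SizeInMiB"] / 1024,
        "network": instance_type_info["NetworkInfo"]["NetworkPerformance"],
        "ebs_baseline_mb_s": ebs.get("BaselineThroughputInMBps"),
        "ebs_max_mb_s": ebs.get("MaximumThroughputInMBps"),
    }


def fetch_specs(instance_types: list[str]) -> list[dict]:
    """Query EC2 for the given types; needs AWS credentials and a region."""
    import boto3  # imported lazily so summarise_specs works without AWS access

    ec2 = boto3.client("ec2")
    response = ec2.describe_instance_types(InstanceTypes=instance_types)
    return [summarise_specs(info) for info in response["InstanceTypes"]]


# Example (requires AWS access):
# for spec in fetch_specs(["c6i.large", "c6in.large"]):
#     print(spec)
```

Comparing the baseline and maximum EBS figures side by side makes it easier to spot sizes that rely on bursting.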

Vantage.sh has a great tool for reviewing instance specs; see the c6a.large instance type below.

c6a.large pricing and specs - Vantage

The c6a.large instance is in the compute optimized family with 2 vCPUs, 4.0 GiB of memory and up to 12.5 Gibps of bandwidth starting at $0.0765 per hour.

Final Thoughts: Don't Over-Optimise Early

Remember: EC2 instances aren't permanent decisions. Start with a reasonable choice, monitor your application's performance, and adjust as needed. The beauty of cloud computing is its flexibility.

Start Smart:

Go with recent generations (m6a or higher is a solid starting point)
If cost is your primary concern, try Graviton instances (e.g. r8g)
Benchmark your workload between Intel, AMD and Graviton to see what works best for you

Once you've gathered real-world performance data, you can optimise further and commit to Savings Plans or Reserved Instances to reduce cost.

Check out tools like Vantage.sh for detailed instance specifications and pricing comparisons. They provide valuable insights into instance capabilities and can help you make more informed decisions.

Interested in how AWS instances translate to real-world metal? Check out my other post here!