Introduction
Azure offers a wide range of stock-keeping units, better known as SKUs. A SKU refers to a specific version or offering of a resource that is available in Azure, and each SKU is tied to the characteristics of that offering. SKUs are grouped into series, such as the “A” or “NV” series, which reflect those characteristics. For the Virtual Machine series, these are directly linked to the number of CPUs, memory, disk, and, in some cases, GPUs.
In the context of Azure Virtual Machines, SKU refers to the different sizes and options available for the virtual machines that can be used to run apps and workloads. The available SKUs are categorized into general purpose, compute optimized, memory optimized, storage optimized, GPU, and high performance compute SKUs. Each SKU is optimized for different workloads based on CPU-to-memory ratio, disk throughput, and network performance. This research focuses on the general purpose SKUs, a popular option for desktop virtualization workloads, and tests and compares the different versions of one series.
Understanding the difference in SKU versions
Let’s break down the versions of the Virtual Machine SKUs. Each series comes in several versions, denoted by a version number at the end of the name. This research uses the D series. D-series virtual machines vary mainly in the CPU type that may be allocated, which means that a virtual machine may actually be running on different hardware.
This research is focused on the general-purpose VMs in the D-series. General-purpose VMs feature a balanced CPU-to-memory ratio, making them suitable for a variety of use cases, ranging from web servers to VDI.
| Generation | CPU Architecture | Hyper-Threading | Key Improvements |
|---|---|---|---|
| Dv2 | Broadwell–Ice Lake | No | ~35% faster than D-series |
| Dv3 | Haswell–Ice Lake | Yes | More memory/vCPU, HT efficiency |
| Dv4 | Cascade–Sapphire | Yes | 3.4 GHz turbo, AVX-512, DL Boost |
| Dv5 | Ice Lake–Sapphire | Yes | 3.5 GHz turbo, better value |
| Dsv6 | Emerald Rapids | Yes | AMX, faster storage, high vCPU count |
Source: Azure VM sizes - General purpose - Azure Virtual Machines | Microsoft Learn
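To make the naming convention concrete, the sketch below splits a SKU name of the kind used in this research into its parts. It is a minimal illustration, not an official parser: the `VmSku` type, the `parse_sku` function, and the regular expression are hypothetical and only cover names shaped like `Standard_D2ds_v5`. The reading of the lowercase letters (`d` for a local temporary disk, `s` for Premium Storage support) follows the Azure naming convention as we understand it and should be checked against the Microsoft documentation.

```python
import re
from typing import NamedTuple


class VmSku(NamedTuple):
    """Parts of a VM SKU name (hypothetical helper type for this sketch)."""
    family: str    # series letter, e.g. "D"
    vcpus: int     # number of vCPUs, e.g. 2
    features: str  # additive feature letters, e.g. "ds"
    version: int   # version number at the end of the name, e.g. 5


# Assumption: only matches names shaped like "Standard_D2s_v3".
# It is not a complete parser for every Azure VM size name.
_SKU_PATTERN = re.compile(r"^Standard_([A-Z]+)(\d+)([a-z]*)_v(\d+)$")


def parse_sku(name: str) -> VmSku:
    """Split a SKU name into family, vCPU count, features, and version."""
    match = _SKU_PATTERN.match(name)
    if match is None:
        raise ValueError(f"Unrecognised SKU name: {name}")
    family, vcpus, features, version = match.groups()
    return VmSku(family, int(vcpus), features, int(version))


if __name__ == "__main__":
    for sku in ("Standard_D2s_v3", "Standard_D2ds_v4",
                "Standard_D2ds_v5", "Standard_D2ds_v6"):
        print(sku, "->", parse_sku(sku))
```

Sorting a list of parsed SKUs by the `version` field gives the oldest-to-newest order used when comparing versions.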
Setup and Configuration
This research aims to explore the differences in performance between Azure VM SKU versions. The following SKUs are included:
| SKU | vCPUs | RAM (GB) | Storage Type |
|---|---|---|---|
| Standard_D2s_v3 | 2 | 8 | Premium LRS |
| Standard_D2ds_v4 | 2 | 8 | Premium LRS |
| Standard_D2ds_v5 | 2 | 8 | Premium LRS |
| Standard_D2ds_v6 | 2 | 8 | Premium LRS |
These SKUs are deemed to have similar specifications and are tested using the same workload. The time taken to complete each part of the workload is recorded alongside performance metrics to determine which SKU performs best.
All testing took place in the Azure UK South Data Center.
Virtual machines are provisioned using HashiCorp Terraform. The Azure Marketplace image in use is the Windows Server 2022 Datacenter Edition image. No optimisations are performed on the VM. Windows Remote Management (WinRM) is enabled and Telegraf is installed to gather performance metrics.
Testing Methodology
The methodology used is different compared to the GO-EUC standard. The goal is to have a benchmark that can run independently without any required infrastructure to compare the differences in computing.
The benchmark used in this research is written in PowerShell, utilizing “fsutil” and “7-Zip” to generate load using file writes, reads, and compression. The flow is as follows:
- Create a file of a specific size
  - The following file sizes are used: 4KB, 16KB, 32KB, 128KB, 512KB, 10MB, 100MB, 500MB
- Copy this file x number of times
  - 4KB – 1000
  - 16KB – 500
  - 32KB – 250
  - 128KB – 150
  - 512KB – 100
  - 10MB – 50
  - 100MB – 25
  - 512MB – 15
- For each of the above file sizes the following tasks are performed in order
  - Read the contents of all these files
  - Compress all these files using normal compression
  - Decompress the archive created using normal compression
  - Compress all these files using maximum compression
  - Decompress the archive created using maximum compression
  - Compress all these files using ultra-compression
  - Decompress the archive created using ultra-compression
  - Remove all files used
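The flow above can be sketched in a few lines of code. The benchmark itself is written in PowerShell with “fsutil” and “7-Zip”; the Python version below is only an illustrative stand-in, not the script used in this research. It uses the standard library `zipfile` module instead of 7-Zip, and the mapping of normal, maximum, and ultra to compression levels 5, 7, and 9 is an assumption. The names `run_benchmark`, `timed`, and `COMPRESSION_LEVELS` are hypothetical.

```python
import shutil
import tempfile
import time
import zipfile
from pathlib import Path

# Assumption: stand-ins for the 7-Zip presets used in the real benchmark.
COMPRESSION_LEVELS = {"Normal": 5, "Maximum": 7, "Ultra": 9}


def timed(timings: dict, stage: str, func):
    """Run func() and store its wall-clock duration in seconds under stage."""
    start = time.perf_counter()
    result = func()
    timings[stage] = time.perf_counter() - start
    return result


def run_benchmark(file_size: int, copies: int) -> dict:
    """Create, copy, read, compress, decompress, and remove test files."""
    timings = {}
    workdir = Path(tempfile.mkdtemp(prefix="skubench_"))
    try:
        # Zero-filled seed file, built in memory (fine for small sizes only).
        source = workdir / "source.bin"
        source.write_bytes(b"\0" * file_size)

        files = [workdir / f"copy_{i}.bin" for i in range(copies)]
        timed(timings, "Copy Files",
              lambda: [shutil.copyfile(source, f) for f in files])
        timed(timings, "Read Files",
              lambda: [f.read_bytes() for f in files])

        for label, level in COMPRESSION_LEVELS.items():
            archive = workdir / f"{label}.zip"
            out_dir = workdir / f"out_{label}"

            def compress():
                with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED,
                                     compresslevel=level) as zf:
                    for f in files:
                        zf.write(f, arcname=f.name)

            def decompress():
                with zipfile.ZipFile(archive) as zf:
                    zf.extractall(out_dir)

            timed(timings, f"Compression_{label}", compress)
            timed(timings, f"Decompression_{label}", decompress)
    finally:
        # Cleanup is timed as its own stage, as in the list above.
        start = time.perf_counter()
        shutil.rmtree(workdir)
        timings["Cleanup Files"] = time.perf_counter() - start
    return timings
```

Calling `run_benchmark(4 * 1024, 1000)` would mirror the 4KB case. The larger file sizes would need a streaming approach rather than building the seed file in memory.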
Compression and decompression are used because they are CPU-intensive tasks. As the main variation between the virtual machine SKUs is the CPU model and speed, a tangible difference in processing time between the SKUs is expected.
The timing of each step is recorded under the following names:
- Copy Files
- Read Files
- Compression_Normal
- Compression_Maximum
- Compression_Ultra
- Decompression_Normal
- Decompression_Maximum
- Decompression_Ultra
- Cleanup Files
These tests are run 10 times on each VM SKU. The results are then averaged for each test type.
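The averaging step can be illustrated with a short sketch. It assumes each run is available as a mapping from test type to duration in seconds; the `summarise` function and that data shape are hypothetical, and the numbers in the usage example are placeholders, not measured results. The standard deviation is an optional extra to show run-to-run spread; this research reports the average only.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarise(runs: list) -> dict:
    """Average the duration of each test type across all runs.

    runs: one dict per test run, mapping test type -> seconds.
    """
    by_stage = defaultdict(list)
    for run in runs:
        for stage, seconds in run.items():
            by_stage[stage].append(seconds)
    return {
        stage: {
            "mean": mean(values),
            # stdev needs at least two samples
            "stdev": stdev(values) if len(values) > 1 else 0.0,
        }
        for stage, values in by_stage.items()
    }


if __name__ == "__main__":
    # Placeholder values for illustration only, not measured results.
    example_runs = [
        {"Read Files": 1.0, "Compression_Normal": 10.0},
        {"Read Files": 3.0, "Compression_Normal": 12.0},
    ]
    for stage, stats in summarise(example_runs).items():
        print(stage, stats)
```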
At the end of each test, the VM is stopped and deallocated; the VM is then started back up before the next test run. This ensures a high likelihood that the VM will be relocated to a different physical server and/or rack within the Azure data center. The test is started as soon as the VM has rebooted and become available.
Hypothesis and Results
Based on the specifications, there are changes in the CPU types, but overall, a performance improvement is expected with a higher version. Additionally, a cost difference is expected, as the newer SKU versions have different prices. Generally, new SKU versions are cheaper to run: the lower price is used as an incentive for consumers to migrate to new hardware, which allows old physical hardware in the data centers to be decommissioned.
Analysis
Timing the different stages of each test gives us an idea of the time it takes for each VM SKU, on average, to complete the various stages.
The data shows the average time taken for each task over the 10 test runs. All test runs were highly consistent, showing little to no variation. The Standard_D2ds_v6 SKU has a slight speed advantage in some tests even though the overall specification is the same, with 2 vCPUs and 8GB of memory. All test stages benefit from the newer CPU architecture of the Standard_D2ds_v6. When reviewing the other virtual machine SKU test results, there are some unexpected outcomes, most notably for the Standard_D2s_v3. These can be linked to the SKU version, as this is the oldest and, therefore, the slowest.
As explained in the introduction, each SKU has a range of CPUs, selected automatically by Azure. As a consumer of the service, you cannot influence which CPU your machine will be allocated.
During this research, the CPU model allocated to the VM in each test run was noted (see below). The variation is a direct result of deallocating the VMs, which could cause a VM to be relocated to a server with a different CPU configuration.
This table shows which processor model was allocated to each VM SKU test.
| SKU | Processor Name | Count |
|---|---|---|
| Standard_D2s_v3 | Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz | 1 |
| Standard_D2s_v3 | Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz | 8 |
| Standard_D2s_v3 | Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz | 1 |
| Standard_D2ds_v4 | Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz | 2 |
| Standard_D2ds_v4 | Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz | 8 |
| Standard_D2ds_v5 | Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz | 10 |
| Standard_D2ds_v6 | Intel(R) Xeon(R) Platinum 8573C CPU @ 2.30GHz | 10 |
The Standard_D2s_v3, Standard_D2ds_v4 and Standard_D2ds_v5 SKUs were allocated some of the same processor models for several of their tests, whereas the Standard_D2ds_v6 was allocated the same processor for all tests. Relating this to the measured results helps explain the timing differences.
| SKU | Processor Speed | Count |
|---|---|---|
| Standard_D2s_v3 | 2295MHz | 1 |
| Standard_D2s_v3 | 2594MHz | 1 |
| Standard_D2s_v3 | 2793MHz | 8 |
| Standard_D2ds_v4 | 2594MHz | 2 |
| Standard_D2ds_v4 | 2793MHz | 8 |
| Standard_D2ds_v5 | 2793MHz | 10 |
| Standard_D2ds_v6 | 2300MHz | 10 |
Reviewing the processor speed for each of the SKU tests shows how many tests were run on a slightly slower CPU. These details are pulled directly from the virtual machine before the tests are run. The performance metrics align with the original hypothesis: the newer the SKU, the quicker the task is performed, even where the reported clock speed is lower, as with the Standard_D2ds_v6.
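As a quick worked example, the run-weighted average clock speed per SKU can be computed from the counts in the table above. The `ALLOCATIONS` structure and the `weighted_mean_mhz` helper are hypothetical names for this sketch; the numbers are copied from the table.

```python
# (clock speed in MHz, number of test runs), copied from the table above
ALLOCATIONS = {
    "Standard_D2s_v3": [(2295, 1), (2594, 1), (2793, 8)],
    "Standard_D2ds_v4": [(2594, 2), (2793, 8)],
    "Standard_D2ds_v5": [(2793, 10)],
    "Standard_D2ds_v6": [(2300, 10)],
}


def weighted_mean_mhz(allocations: list) -> float:
    """Average clock speed, weighted by how many runs used each speed."""
    total_runs = sum(count for _, count in allocations)
    return sum(mhz * count for mhz, count in allocations) / total_runs


if __name__ == "__main__":
    for sku, allocations in ALLOCATIONS.items():
        print(f"{sku}: {weighted_mean_mhz(allocations):.1f} MHz")
    # Standard_D2s_v3: 2723.3 MHz
    # Standard_D2ds_v4: 2753.2 MHz
    # Standard_D2ds_v5: 2793.0 MHz
    # Standard_D2ds_v6: 2300.0 MHz
```

The v3, v4, and v5 averages rise with the version, while the v6 reports the lowest clock speed of all. This suggests that clock speed on its own does not explain the timing results, and that the processor generation plays a role as well.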
7-Zip is multi-threaded and therefore utilises all available CPU cores of the VM to perform tasks as quickly as possible. Noticeably, the Standard_D2ds_v5 and Standard_D2ds_v6 CPU metrics stop long before those of the other SKUs in the test, because the workload is processed faster due to the newer processor architecture. However, there is a similar pattern of CPU usage between all SKUs. The Standard_D2s_v3 and the Standard_D2ds_v4 are slower, with the Standard_D2s_v3 being the slowest.
The disk queue length shows the same overall pattern as the CPU. The workload spikes during the same periods, when heavy disk operations are performed. It is still visible which VM SKUs reach this part of the workload first, and the Standard_D2ds_v6 SKU is the first to get there. This correlation in the graph reinforces the original hypothesis, but no further conclusions can be drawn from this information.
The total timing of all measurements combined shows that the Standard_D2ds_v6 is the fastest SKU for this series. The test timings show what is expected, with the older VM SKUs taking longer than the newer VM SKUs.
Costs
Cost is always a contributing factor when running virtual machines in public cloud environments. Selecting the “correct” VM SKU for the use case is crucial.
In this research, costs are considered a secondary factor. Let’s break it down by showing the SKU cost per minute.
The price per minute is not quite as expected. The expectation was that a higher version would be priced higher due to the newer CPU types, but the cost difference between the v5 and the v6 SKU is minimal.
The Standard_D2ds_v5 SKU is the same price as the Standard_D2ds_v4 machine, but in these tests it was always allocated the faster CPU. The Standard_D2ds_v6 SKU is slightly more expensive and is actually allocated a processor with a lower clock speed in GHz, yet it completes workloads at the fastest pace.
The total cost can be calculated based on the total running time and the retail price per SKU version.
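The calculation can be sketched as follows. The function names are hypothetical, and the values in the usage example are placeholders for illustration, not the measured run times or the actual Azure retail prices. Since prices are often listed per hour, a small helper converts an hourly price to a per-minute price.

```python
def price_per_minute(price_per_hour: float) -> float:
    """Convert an hourly retail price to a per-minute price."""
    return price_per_hour / 60.0


def total_cost(runtime_seconds: float, per_minute_price: float) -> float:
    """Cost of a run: running time in minutes times the price per minute."""
    return (runtime_seconds / 60.0) * per_minute_price


if __name__ == "__main__":
    # Placeholder inputs only: a 600 second run at 0.12 per hour.
    per_minute = price_per_minute(0.12)
    print(f"price per minute: {per_minute:.4f}")
    print(f"total cost: {total_cost(600, per_minute):.4f}")
```

Comparing this figure across SKUs shows whether a faster but slightly more expensive SKU ends up cheaper for a fixed amount of work.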
In this scenario, the Standard_D2ds_v6 is the fastest at completing the workloads but is a touch more expensive than the Standard_D2ds_v5 SKU. The Standard_D2ds_v5 SKU is technically the most cost-effective and could provide the best performance for the price. In the context of an EUC-based workload, like Azure Virtual Desktop, using a Standard_D2ds_v6 SKU could translate to more performance, as tasks are completed more quickly.
Conclusion
It is essential to understand that a specific SKU can come with different types of CPUs. Each time you stop, deallocate, and start a machine, the machine could be allocated a different CPU. Depending on your workload, selecting the correct SKU and version is essential to ensure consistency.
Because a single SKU offering spans multiple CPUs, a workload that requires a specific CPU should use a SKU that supports that CPU. Please note that, as the scope of this research was the differences between the versions, the differences between individual CPUs have not been compared.
Based on this research, the Standard_D2ds_v5 and Standard_D2ds_v6 SKUs are very close in performance for your money when compared to the other versions of the Azure D-series SKUs. If you use an older v3 or v4 version, consider upgrading to the v5 or v6 SKUs to ensure reliable performance when deallocating and restarting your machines. If you are looking for pure performance, the v6 SKUs will give you the best raw performance. For anyone using capacity management solutions and deallocating machines outside business hours, it’s critical to ensure consistent performance day-to-day.




