Citrix VDA versions breakdown, a giant leap forward

With each iteration of the Citrix Virtual Delivery Agent (VDA) software, Citrix treats us to new features and functionality. In the Current Releases (CR), Citrix keeps adding features while also improving the performance of the HDX protocol itself. This should, in theory, result in a better user experience, coupled with a higher user density and less strain on the system.

Disclaimer: These results have been affected by the Login VSI progress bar and results may be different in practice. For more information please read the following post.

The Long Term Service Release (LTSR) program provides stability and long-term support for XenApp and XenDesktop releases. XenApp and XenDesktop LTSRs are currently available for Versions 7.6 and 7.15. Cumulative Update 3 (CU3) is the most recent update to the 7.15 LTSR.

Since the last LTSR release, Citrix has shipped five CRs: 7.16, 7.17, 7.18, 7 1808.2 and 7 1811.1.

Version 7 1808.2 is the first version to use not only the new product naming but also the new version naming scheme. For the sake of clarity, we will refer to Virtual Apps and Desktops for all product versions in this post.

In this research we will focus on the latest three CRs from that list and compare them to the latest two cumulative updates for 7.15 LTSR, which at the time of writing are CU2 and CU3.

This means that there are five scenarios in total:

  • Desktop OS VDA version 7.15 LTSR CU2
  • Desktop OS VDA version 7.15 LTSR CU3
  • Desktop OS VDA version 7.18
  • Desktop OS VDA version 7 1808.2
  • Desktop OS VDA version 7 1811.1

What’s new

To give some context for the different VDA versions, here is a small breakdown of the more notable VDA improvements and new features that we suspect can have an impact on the user experience or the server scalability of the environment:

Version | Feature | Remark
7.18 | Battery icon notification | While not necessarily performance related a very much requested feature nonetheless
7.18 | Enhanced server VDA webcam functionality | Thinwire enhancements (‘Build to lossless’ preference of the Visual quality policy setting is now H.264 instead of JPEG for moving images)
7.18 | H.264 Build-to-Lossless |
1808.2 | Better network throughput over high latency connection | https://support.citrix.com/article/CTX125027
1808.2 | Chrome enhancement to browser content redirection | Browser content redirection now supports the Chrome browser in addition to the previously supported Internet Explorer browser
1808.2 | NVENC video encoding support on Server OS VDAs | NVENC video encoding support on Server OS VDAs. The XenApp and XenDesktop 7.17 release introduced Desktop VDA support for selective H.264/H.265 encoding with NVIDIA NVENC GPUs. In this release, the similar capabilities have now been extended to Server OS VDAs equipped with NVIDIA NVENC GPUs
1811.1 | Graphics status indicator | The Graphics status indicator policy has been updated to replace the display lossless indicator policy
1811.1 | DPI matching on Windows 10 | DPI matching allows the Windows 10 desktop session to match the DPI of the endpoint when using Citrix Workspace app for Windows.
1811.1 | HDX adaptive throughput | HDX adaptive throughput intelligently fine-tunes the peak throughput of the ICA session by adjusting output buffers. The number of output buffers is initially set at a high value. This high value allows data to be transmitted to the client more quickly and efficiently, especially in high latency networks. Providing better interactivity, faster file transfers, smoother video playback, higher framerate and resolution results in an enhanced user experience.

Note: from version 7.17 onward, a new MDRLE encoder with a higher compression ratio has been added. Citrix reworked the MDRLE codec so that it consumes less bandwidth in typical desktop sessions when compared with 2DRLE. More information on the Lossless Compression Codec (MDRLE) can be found here: https://support.citrix.com/article/CTX232041

Configuration and infrastructure

As always, the tests were conducted using the standard testing methodology. An in-depth look into the methodology can be found here.

The tests were configured to use non-persistent desktops delivered using Citrix Virtual Desktops (MCS), running Microsoft Windows 10 build 1809 and configured with 2 vCPU and 4 GB RAM. Both Windows and Office were fully patched. Windows Defender was disabled, as it may influence the metrics and result in unreliable data. The image was fully optimized with Citrix Optimizer version 2 using the Citrix-supplied ‘Windows 10 1809’ template.

Host Results

Based on the improvements outlined above we should expect an improvement in scalability, performance and network throughput. As usual, the first metric that we use to evaluate the scalability of the scenarios is the Login VSI VSImax value. More information about the VSImax can be found on the Login VSI website.

vsimax

Higher is better

All Current Release VDA versions show an overall improvement in the scalability of the environment compared to both LTSR versions, but from 1808.2 onward the improvement is massive. An improvement of over 30% is really exceptional, and our interpretation is that Citrix worked very hard to minimize the footprint of the VDAs in the latest versions.
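The percentage quoted here is a change relative to the LTSR baseline. A minimal Python sketch of that arithmetic follows. The VSImax scores in it are made-up placeholders, not measured values from this research, and `relative_change` is a hypothetical helper name.

```python
def relative_change(baseline: float, value: float) -> float:
    """Return the change of value relative to baseline, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (value - baseline) / baseline * 100.0


# Placeholder VSImax scores per scenario (illustrative only, not measured data).
vsimax = {
    "7.15 LTSR CU2": 100.0,
    "7.18": 110.0,
    "7 1808.2": 132.0,
}

baseline = vsimax["7.15 LTSR CU2"]
for scenario, score in vsimax.items():
    # A positive result means a higher VSImax than the baseline scenario.
    print(f"{scenario}: {relative_change(baseline, score):+.1f}%")
```

The same helper works for the "lower is better" metrics further on, where a negative result indicates an improvement.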

cpu

Lower is better

cpu-compare

Lower is better

Because with most workloads the CPU usage is the main bottleneck, the CPU usage results are in line with the previously listed VSImax results.

memory

Lower is better

memory-compare

Lower is better

Memory consumption, on the other hand, shows a slight increase across the newer versions compared to 7.15.

reads-compare

Lower is better

writes-compare

Lower is better

command-compare

Lower is better

The storage charts all show an overall decrease in storage load for versions 1808.2 and 1811.1. It’s unclear why version 7.15 CU3 is showing a minor increase in the load compared to 7.15 CU2.

To get an idea of the experience from a user’s perspective, we use the tool Remote Display Analyzer to collect data from the protocol. This data is captured for each individual user, but it’s important to take into account that different users cannot be compared because the duration of the Login VSI workload is different for each user. Therefore the published results are from a single user: the first user that is active.

Within the HDX protocol, the framerate metric is reported and collected. This metric is expressed in FPS, or frames per second. In general, the greater the FPS value, the smoother the user experience will appear.

fps

fps-compare

Versions 1808.2 and 1811.1 show an enormous drop in the average FPS during the workload. As stated earlier, a higher framerate is generally favorable because it results in a smoother user experience.

To verify the accuracy of the data, we validated the findings by examining the Remote Display Analyzer data for the second user:

fps-validation

Again we can see the same trend: an overall drop in FPS for 1808.2 and 1811.1. With this validation, we can conclude that the data is accurate and that we need to find a plausible explanation for the drop in FPS.

On that account let’s focus on the FPS data for 7.18 and 1808.2:

fps-vda-version

During the first part of the workload, which is mostly text-based, we can see that the FPS count for 1808.2 is lower, but when the part of the workload that plays multimedia content is reached there is an increase in FPS for 1808.2. Specifically, this is the part of the workload from the 43-minute mark up to the 50-minute mark (the multimedia section):

fps-vda-version-compare

Normally we would rate a low FPS count as disadvantageous because a higher FPS count results in a better and smoother user experience. In the multimedia section of the workload, the FPS count increases in order to deal effectively with the increased screen output. At the 44, 46 and 48-minute marks of the workload we can see a bump in the FPS for version 1808.2.
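Splitting the FPS series by workload section, as done above, can be sketched in a few lines of Python. The sample values and the `mean_fps` helper are hypothetical. Only the 43 to 50 minute window for the multimedia section comes from the text.

```python
from statistics import mean


def mean_fps(samples, start_min, end_min):
    """Average FPS of samples whose timestamp lies in [start_min, end_min)."""
    window = [fps for minute, fps in samples if start_min <= minute < end_min]
    if not window:
        raise ValueError("no samples in the requested window")
    return mean(window)


# Hypothetical (minute, fps) samples for one user session; not measured data.
samples = [(10, 4), (20, 6), (30, 5), (44, 18), (46, 22), (48, 20)]

text_part = mean_fps(samples, 0, 43)         # mostly text-based part
multimedia_part = mean_fps(samples, 43, 50)  # multimedia section
print(f"text: {text_part:.1f} FPS, multimedia: {multimedia_part:.1f} FPS")
```

Comparing the sections separately avoids judging a version on a single session-wide average, which hides the difference between text-based and multimedia content.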

This section of the workload, where the multimedia content is played, is where the newer VDA versions really start to shine and show the benefit of all the protocol improvements that Citrix has made.

The newer caching algorithms in the newer VDAs result in a lower FPS and have a huge impact on text-based workloads, like the first part of the Login VSI workload. Caching the screen updates enables the reuse of frequently used images. Because these screen updates are cached locally by the VDA, they don’t need to be transmitted, which results in a lower FPS.
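The general idea of reusing cached screen updates can be illustrated with a toy content-addressed cache. This is only a sketch of the concept and not how HDX implements caching: the `TileCache` class, the hashing scheme and the sample updates are all assumptions made for illustration.

```python
import hashlib


class TileCache:
    """Toy sender-side cache: each distinct tile is sent in full only once."""

    def __init__(self):
        self._seen = set()

    def encode(self, tile: bytes):
        """Return ('ref', digest) for a known tile, else ('data', tile)."""
        digest = hashlib.sha256(tile).hexdigest()
        if digest in self._seen:
            return ("ref", digest)
        self._seen.add(digest)
        return ("data", tile)


cache = TileCache()
# Hypothetical screen updates: the same toolbar bitmap shows up three times.
updates = [b"toolbar", b"document-page-1", b"toolbar", b"toolbar"]
messages = [cache.encode(tile) for tile in updates]
full_sends = sum(1 for kind, _ in messages if kind == "data")
print(f"{full_sends} of {len(updates)} updates sent in full")
```

In this toy model, mostly static content such as text produces many repeated tiles, so few updates need to be sent in full.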

In order to render the frames sent to the endpoint, the VDAs require computing power. We can measure the impact of the computing power required with the ‘Average CPU for Encoding’ metric from Remote Display Analyzer.

cpu-encoding-compare

Lower is better

Because of the lower overall FPS count for the newer VDA versions, the CPU for encoding is also significantly lower. The chart above shows that versions 1808.2 and 1811.1 have a huge drop in CPU load of almost two-thirds in comparison to 7.15.

With the drop in FPS we would expect a significant drop in bandwidth consumption as well. Fewer frames rendered means there is less to transmit over to the client.

bandwidth

Lower is better

bandwidth-compare

Lower is better

As foreseen, the overall bandwidth also benefits from the ICA improvements, with a 25% drop in bandwidth for 1808.2. Again, the caching done by the HDX protocol can account for these numbers.

Perhaps the drop in bandwidth consumption can also be attributed to HDX adaptive throughput. With adaptive throughput, the protocol intelligently fine-tunes the peak throughput of the ICA session by adjusting output buffers.
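Why the number of output buffers matters on high latency links can be illustrated with the general rule for window-based protocols: at most one window of data can be in flight per round trip. The Python sketch below applies that rule. The buffer counts, buffer size and latency are assumed example values, not Citrix defaults, and `max_throughput_kbps` is a hypothetical helper.

```python
def max_throughput_kbps(buffers: int, buffer_bytes: int, rtt_ms: float) -> float:
    """Upper bound on throughput (kilobits/s) when one window is sent per round trip."""
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    window_bits = buffers * buffer_bytes * 8
    return window_bits / rtt_ms  # bits per millisecond equals kilobits per second


# Assumed example values: 1460-byte buffers on a link with 200 ms round trip time.
for buffers in (44, 88, 176):
    print(f"{buffers} buffers -> {max_throughput_kbps(buffers, 1460, 200.0):.0f} kbps")
```

Under this simplified model, doubling the number of output buffers doubles the peak throughput at a given latency, which is why tuning them matters most on high latency connections.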

rtt

Lower is better

rtt-compare

Lower is better

The ICA Round Trip Time (RTT), which for a large part reflects the responsiveness of the desktop, declines steadily across the versions, culminating in a 60% decrease when comparing VDA 7.15 CU2 to VDA 1811.1. There is no debate here; a lower RTT is always better.

Launcher results

So far we’ve only looked at the data from the host perspective. As the endpoint (or the launchers in our case) needs to process all information sent from the VDA, its statistics are also key to take into account.

launcher-cpu

Lower is better

launcher-cpu-compare

Lower is better

As with the host metrics, the newer VDA versions have a significantly lower CPU impact on the launchers, but here the drop is 40% compared to 7.15. The increase in CPU utilization for version 7.18 is unexplained at the moment and warrants further investigation.

logon

Lower is better

logon-compare

Lower is better

The time-to-desktop also shows a remarkable decline in average login times. As this is one of the key metrics for end users, upgrading to the latest VDA versions will increase end-user experience and satisfaction.

Conclusion

The title of this research already gave away part of the conclusion: a small step for Citrix is a giant leap for end users when it comes to the VDA versions.

Citrix has done a tremendous amount of work on the VDA and HDX protocol fronts. We’ve seen massive improvements in both the user experience and server scalability. Citrix keeps tuning the protocols with each newer version and this clearly shows in the performance of the VDAs.

More information on the continual improvements in the HDX Protocol can be found here:

https://www.citrix.com/blogs/2018/10/30/turbo-charging-ica-part-1/

https://www.citrix.com/blogs/2018/12/17/turbo-charging-ica-part-2/

In more real-world scenarios, where multimedia content is becoming more and more common, the impact of these improvements will be even higher than in our lab environment. The Login VSI workload consists of moderately static content and is a bit outdated in that it only uses SD content, whereas multimedia content nowadays is typically 1080p or even 4K in resolution.

If your lifecycle management process fits within the 6-month release timeframe of CR, and performance and user experience are key in your environment, then based on these findings we can only recommend using CR instead of LTSR.

If you want to comment on this research and its findings, please feel free to reach out. Alternatively, you can join the Slack channel and discuss with the community.

Photo by Joshua Earle on Unsplash