The ControlUp Support Team tends to see actual use cases — not theoretical best practices, but how the product is being used in production environments. During a recent customer visit, the customer asked me what might be going on with VDI users in a delivery group. One set of VDI users reported lag on their VDI clients while the other group did not. Now, lag is a term that isn’t very well defined. It can mean actual network lag or just a sluggish application.
Looking at the two sets of users, none of the traditional metrics showed any cause for concern. There was plenty of headroom in terms of CPU, memory, and IOPS, and no major differences in network latency.
One item that was different was bandwidth limit. The users who had been reporting lag had a bandwidth limit of about 1-2 Mbps, while the users who did not complain had 10-30 Mbps. A giant difference!
Bandwidth limit is the amount of bandwidth allocated to the remoting protocol. It’ll never be a very accurate number, but the big difference does show a problem. The users that were experiencing lag clearly had either an actual lack of bandwidth or some form of network congestion. Either way, our finding helped the customer narrow down the issue so they could remediate the problem.
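As a rough illustration of how that comparison could be automated, the sketch below flags sessions whose bandwidth limit sits far below the rest of the fleet. The session names, the sample values, and the 25% cutoff are all hypothetical; they are not taken from the customer's environment or from any ControlUp API.

```python
from statistics import median

def flag_constrained_sessions(limits_mbps, ratio=0.25):
    """Return the names of sessions whose bandwidth limit is below
    ratio * the fleet median. Both the ratio and the use of the
    median are assumptions made for this sketch."""
    fleet_median = median(limits_mbps.values())
    cutoff = ratio * fleet_median
    return sorted(name for name, mbps in limits_mbps.items() if mbps < cutoff)

# Hypothetical bandwidth limits in Mbps, loosely shaped like the two groups above.
sessions = {
    "vdi-01": 1.2,
    "vdi-02": 1.8,
    "vdi-03": 12.0,
    "vdi-04": 28.0,
    "vdi-05": 17.5,
}

print(flag_constrained_sessions(sessions))
```

A median-based cutoff only works when the constrained sessions are a minority. If half or more of the fleet is constrained, a fixed threshold would be a safer choice.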
I worked with Tom Fenton on ControlUp’s Technical Marketing team to replicate this issue in our ControlUp lab using VMware Horizon. Our goal was to show the effects that limited bandwidth has on users, how bandwidth issues manifest themselves, and what VDI administrators can do to alleviate bandwidth issues and improve the VDI user experience.
For a point of reference on how much bandwidth is required for a VDI session, the Horizon 7 Architecture Planning Guide lays out guidance for PCoIP and Blast Extreme display protocols. The network requirements vary from 100 Kbps of bandwidth for basic office end users that do not use video or 3D graphics, to 2 Mbps for end users that are running 480p video. As another point of reference, Netflix recommends a minimum of 1.5 Mbps for basic video streaming but HD videos will require 5 Mbps and Ultra HD require 25 Mbps of network bandwidth.
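To make those reference points easier to compare against a measured link, here is a minimal sketch that lists which of the workloads above fit within a given amount of bandwidth. The figures are the ones quoted in this post; the table name, the function name, and the simple "required <= available" rule are assumptions for illustration and ignore protocol overhead and bursts.

```python
# Reference figures quoted above, in Mbps. The labels are our own wording.
GUIDANCE_MBPS = [
    ("basic office VDI session (no video or 3D)", 0.1),
    ("Netflix basic streaming", 1.5),
    ("VDI session running 480p video", 2.0),
    ("Netflix HD", 5.0),
    ("Netflix Ultra HD", 25.0),
]

def workloads_within(available_mbps):
    """Return the workloads whose quoted requirement fits in available_mbps."""
    return [name for name, need in GUIDANCE_MBPS if need <= available_mbps]

# A 1 Mbps link only covers the basic office case in this table.
print(workloads_within(1.0))
```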
To show the effect limited bandwidth has on an end-user session, the video below shows two sessions playing video side by side. We used NetropyVE to limit the session shown on the left to 1 Mbps, while the session on the right was limited only by the 1 Gbps NIC on the VDI client. You can see that the video in the limited session is jerky and is missing frames, while the other session plays relatively smoothly. The sound in both sessions played without any issues, as the Blast display protocol gives audio priority when streaming. More importantly, we noticed lag when we interacted with applications in the limited session shown on the left and did not experience lag in the other session.
Using the ControlUp Management Console (CMC) to examine the network usage of the two virtual desktops, we see that while playing the video, the session with limited bandwidth was using ~1 Mbps while the unlimited session was consuming ~6 Mbps of bandwidth.
Another interesting phenomenon that we noticed was that the virtual desktop with the 1 Mbps connection was using 3 times the amount of CPU processing. This was due to the additional processing needed to render the images for display.
What is more telling is that the session with limited bandwidth is showing a User Input Delay of 157 ms.
We used iPerf to quantify the amount of bandwidth available to the VDI clients. The results showed that the VDI client with the 1 Gbps connection had a maximum bandwidth of 596 Mbps, while the client limited to 1 Mbps had a maximum bandwidth of 1 Mbps.
The 1 Gbps connection running at ~600 Mbps rather than 1000 Mbps was expected, as we were testing TCP connections. TCP performs error detection and correction, which can reduce measured throughput. When we used iPerf to test the bandwidth using UDP rather than TCP, we got near the line speed of 1000 Mbps. PCoIP and Blast can use UDP rather than TCP.
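For readers who want to repeat this kind of TCP versus UDP comparison, the sketch below builds an iperf3 client command and reads the throughput from its JSON output. It assumes iperf3 specifically (the `-c`, `-u`, `-b`, and `-J` options) and a server already running `iperf3 -s`; the host name and the sample JSON are hypothetical, and this is not the exact procedure we used in the lab.

```python
import json

def iperf3_command(server, udp=False, target_rate="1G"):
    """Build an iperf3 client command that emits JSON (-J).
    For UDP, -b sets the target rate; without it iperf3 sends at a low default."""
    cmd = ["iperf3", "-c", server, "-J"]
    if udp:
        cmd += ["-u", "-b", target_rate]
    return cmd

def iperf3_mbps(json_text):
    """Extract throughput in Mbps from iperf3 JSON output.
    TCP runs report end.sum_received; UDP runs report end.sum."""
    end = json.loads(json_text)["end"]
    summary = end.get("sum_received") or end.get("sum")
    return summary["bits_per_second"] / 1e6

# Hypothetical, heavily trimmed sample of what a TCP run might return.
sample = '{"end": {"sum_received": {"bits_per_second": 596000000.0}}}'
print(iperf3_command("iperf-server.example", udp=True))
print(round(iperf3_mbps(sample), 1))
```

In practice the command list would be passed to `subprocess.run(..., capture_output=True, text=True)` and the captured standard output handed to `iperf3_mbps`.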
To help identify VDI clients with performance issues you can use the Branch Mapping feature in ControlUp (see our blog on it here) to tag the VDI clients that are attached to network segments with limited bandwidth. While this approach will not alleviate network limitations, it will help you easily identify those sessions with limited bandwidth.
To remediate the limited bandwidth issue, you have several options: try a different display protocol, use Group Policies to tune the protocol that is being used, change networking hardware to make more bandwidth available to those VDI clients, or add additional networking to minimize congestion.
For more information about using iPerf, read this blog post on Using iPerf to Baseline Network Performance.