Horizon Instant clones were (and are) one of the most transformative technologies VMware has ever implemented. Using instant clone technology, virtual desktops can be created in seconds; sure, they do have some limitations, but these are greatly outweighed by the benefits that they provide.
The technology behind VMware Horizon instant clones was first announced at VMworld 2014 as “Fargo” or “VMFork.” Instant clones allow a running virtual machine (VM) to be “forked” to a child machine. This child VM initially uses the same real memory and disk space as its parent. When a write to the memory or disk occurs, though, it stores this data in a separate space, saving RAM and disk space. Because it forks a running VM, the clones can be created in just seconds.
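The sharing behavior described above is a form of copy-on-write. As a rough illustration only (this is a toy model, not how vSphere implements forking, and the class and page names are made up for the example), a child can read from its parent's pages until it writes, at which point the new data lands in the child's private space:

```python
class ForkedMemory:
    """Toy copy-on-write view over a parent's pages (illustrative only)."""

    def __init__(self, parent_pages):
        self._parent = parent_pages   # shared with the parent, never modified here
        self._private = {}            # pages this child has written to

    def read(self, page):
        # Prefer the child's private copy; otherwise fall back to the parent's page.
        return self._private.get(page, self._parent[page])

    def write(self, page, data):
        # Writes never touch the parent; they are stored in the child's own space.
        self._private[page] = data

    def private_page_count(self):
        return len(self._private)


# Hypothetical parent with three pages, forked into two children.
parent = {0: "kernel", 1: "apps", 2: "empty"}
child_a = ForkedMemory(parent)
child_b = ForkedMemory(parent)
child_a.write(2, "user data")
```

After the write, only `child_a` holds a private page; `child_b` and the parent still see the original contents, which is why the clones consume so little extra memory and disk at first.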
The first product to use this technology was Horizon 7, which was released in Q1 2016.
I used ControlUp Console to look at one node on which an instant clone desktop pool was running; it showed the user desktops (CALLCENTERxxx) and the other objects used by instant clones. What I found interesting was that the active memory of the user desktops was only a fraction of what a non-instant clone VM would use; they were using the CP-Parent VM’s memory as much as possible.
Before the advent of instant clone technology, virtual desktops were, for the most part, persistent. Due to their characteristics, however, instant clones were considered ephemeral as, until recently, they were destroyed every time a user logged off.
The ephemeral nature of instant clones and the speed with which they can be created mean that admins can spin up virtual desktops on the fly, so precious resources (CPU, RAM, disk) aren’t consumed until they are needed.
Their speed and flexibility mean they are more likely to have the latest software and, thus, to be unencumbered by the extraneous information that builds up over time. That said, desktop users still need persistent storage, and VMware has addressed that with App Volumes, Dynamic Environment Manager (formerly known as VMware User Environment Manager), and other mechanisms that handle the disaggregation of data from a persistent desktop (VMware calls this “just-in-time” or “composable” desktops).
Before jumping into how to use ControlUp to monitor and work with Horizon instant clones, we should look at the underlying architecture of instant clones and how they are created. An instant clone is based on a master VM. This VM needs to have the operating system, VMware Tools, and any applications that you will not be attaching to it via App Volumes (or some other mechanism) installed.
Install Horizon agent on the master VM. When doing this, be sure to deselect the linked clone feature and select the instant clone feature, as they are mutually exclusive.
After you create a master VM, take a snapshot of it.
Once the master VM is created, you can use the Horizon Console to create a desktop pool of instant clones.
When you create a desktop pool, simply specify the master image, snapshot, compute cluster, and datastores you want to use.
From the master VM, a template is created, followed by a replica and then a parent VM for each datastore, for each host.
The template is a clone of the Master VM with the snapshot and has a naming scheme of cp-template-GUID in vCenter in ClonePrepInternalTemplateFolder. The template will be on the same datastore as the Master VM; there is 1 Template VM.
The Replica is a full, thin-provisioned clone of the Template; it has a naming scheme of cp-replica-GUID and appears in vCenter under ClonePrepReplicaVmFolder. There will be one Replica on each of the datastores you specified when you created the desktop pool.
The parent is used to fork the desktop VMs; one is placed per host, per datastore. It has a naming scheme of cp-parent-GUID and appears in vCenter under ClonePrepParentVmFolder. There will be a parent on the datastore for each host; this protects the VMs if a host goes offline.
The instant clone desktops are placed on the datastore you specified when you created the desktop pool. Each desktop will start out small (approx. 2.5 GB) and grow as more information is written to it.
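To make the object counts above concrete, here is a minimal sketch that tallies the helper VMs for one pool and one snapshot. It assumes every host can see every datastore in the pool (with local datastores, as in the test setup described next, the parent count would be lower), and the host and datastore names are hypothetical:

```python
def instant_clone_objects(hosts, datastores):
    """Count the helper VMs described above for one pool and one snapshot.

    Assumes each host has access to each datastore (shared storage).
    """
    return {
        "template": 1,                            # one cp-template per master snapshot
        "replica": len(datastores),               # one replica per datastore
        "parent": len(hosts) * len(datastores),   # one parent per host, per datastore
    }


# Hypothetical cluster: two hosts sharing three datastores.
counts = instant_clone_objects(["esx-1", "esx-2"], ["ds-1", "ds-2", "ds-3"])
```

A quick tally like this is a useful sanity check when estimating how much datastore space and host memory the pool's supporting objects will consume before any desktops are created.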
To show how this manifests itself on an actual system, I created a desktop pool (InstClonePool) with nine instant clone VMs (InstCloneDTX) running on a cluster (NUC01) with two hosts (NUC01 and NUC9); I used two local datastores on each host. The parent VM (NUC01_ICMaster) lived on a datastore on NUC9Pro01. The desktop VMs were in a folder named NUC01-ICPool, and used resources from the resource pool NUC01RP_IC01.
By watching Recent Tasks in the vSphere Client, I saw it create the objects needed for instant clones. Creating the instant clone pool took about 15 minutes. Most of this time was spent creating the templates and replica objects; VMware refers to this as priming.
I wanted to see how long it would take to create instant clone desktops from a desktop pool that was already primed. I disabled the pool and deleted the instant clone desktops. Once I re-enabled provisioning, nine instant clones were created in less than 20 seconds.
From the vSphere Client, I could see the cp-template, cp-parent, and cp-replica VMs in their respective folders.
To help visualize what was going on, I connected to an instant clone desktop and created a chart showing the location and resource usage of the instant clone objects on two of the four datastores.
When I logged off a desktop, the instant clone desktop was deleted, recreated, and available for use in under five seconds.
The lifecycle of a desktop varies, and desktops must be updated, patched, and otherwise modified for security and functional reasons. To do this for an instant clone desktop pool, the master VM is powered on, modified, and powered down again, and a new snapshot is taken. Then, from the Horizon Console, the desktop pool is selected and Schedule is selected from the Maintain drop-down menu.
The Schedule Push Image dialog allows you to select a different snapshot or even a separate master image.
Select the time you want the new image to be placed into service, and choose whether you want to force users to log off or would rather have their desktops recreated the next time they log off.
Usually, having ephemeral desktops isn’t an issue, but some users need desktops that can survive a log-off. To address this issue, VMware incorporated longer-life instant clones in Horizon v7.9 that aren’t destroyed after a user logs off. The switch used to create a longer-life instant clone desktop pool is located under the Desktop Pool Settings dialog. The drop-down menu Refresh OS disk after logoff has three options: Never, Every, and At.
These settings allow you to specify how you want the desktop to be refreshed: always, after a set number of days, when the disk usage reaches a set percentage, or never. NOTE: VMware advises against using the Never setting.
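The refresh choices above amount to a simple decision rule at logoff. The sketch below is only an illustration of that rule; the `RefreshPolicy` type, the mode strings, and the "greater than or equal" threshold semantics are my assumptions, not Horizon's actual implementation or API:

```python
from dataclasses import dataclass


@dataclass
class RefreshPolicy:
    mode: str              # "always", "every", "at", or "never" (hypothetical labels)
    days: int = 0          # used when mode == "every"
    disk_percent: int = 0  # used when mode == "at"


def should_refresh(policy, days_since_refresh, disk_used_percent):
    """Return True if the OS disk should be refreshed when the user logs off."""
    if policy.mode == "always":
        return True
    if policy.mode == "every":
        return days_since_refresh >= policy.days
    if policy.mode == "at":
        return disk_used_percent >= policy.disk_percent
    if policy.mode == "never":
        return False
    raise ValueError(f"unknown refresh mode: {policy.mode}")


# A pool set to refresh every 15 days keeps a 3-day-old desktop at logoff.
keep_desktop = not should_refresh(RefreshPolicy("every", days=15), 3, 40)
```

Seen this way, the Never setting is the only one with no path back to a clean image, which is presumably why VMware advises against it.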
To test this, I set my instant clone desktop to refresh after 15 days. I then logged into the desktop, changed its background to High Contrast White, and then disconnected and logged off. Previously, logging off would have deleted the desktop, and a new one would have been created when I logged back in.
When I logged back into my instant clone virtual desktop, the white background indicated that this was not a new desktop but the one I had previously used. I also monitored my desktop from Horizon Administrator and found that when I logged out of the desktop, its status was assigned but not connected, a state that I would not have seen before for an instant clone.
Instant clone technology can generate hundreds of virtual desktops in no time and these desktops can be placed on a wide variety of hosts and datastores. Monitoring and troubleshooting them can be a monumental task that can overwhelm even the most competent system administrator. ControlUp has spent the last decade helping virtual desktop administrators effectively and efficiently monitor thousands of desktops, including instant clone desktops.
ControlUp is a VMware partner and our products can be downloaded from MyVMware under Horizon Service.
I added the agent to the master instant clone image following the instructions in the ControlUp Getting Started Guide under the section Adding a ControlUp Agent in Horizon Environment.
When an EUC environment is added to ControlUp Console, all desktop pools, including instant clone pools, are added to the organization tree.
If the ControlUp agent is not installed in the master image, metrics from inside the desktop will not be available, but information that can be gleaned from vSphere and Horizon will be shown. Columns with Horizon information are prefixed with HZ.
You can add the agent to the desktops by clicking Install Agent or Add Machines. Once it is added, metrics from inside the desktop, such as CPU, will be shown.
However, when a desktop pool is expanded, the new machines must be added again; it is therefore best to include the ControlUp agent in the master image. You can do this by adding the ControlUp agent to the master image and then scheduling an image push.
Once an instant clone desktop has the ControlUp agent installed in it and the Horizon EUC environment has been added to the ControlUp Console, it can be monitored and managed like any other Horizon desktop.
From the default view in the ControlUp Console for Horizon desktops in the aggregate view, you can see information regarding the pool and its overall health.
From the individual machine view, you can see the Horizon state, pool name, power state, agent (if it is assigned to a user), and other Horizon specific information.
You can see resource usage (CPU, RAM, etc.) for the desktops from the default column preset view for Horizon desktops. If a more detailed resource usage view is required, you can select Virtual Machines from the Column Preset drop-down menu.
Using instant clones, you can create hundreds of desktops from the same master image. Using the ControlUp Console, you can perform real-time, multi-dimensional analysis with its grouping function. For example, if a user complains about poor performance, you can group the desktops to compare the resources that user's desktop is consuming with those of the other desktops. By doing this, you may be able to move from correlation to causation.
In my example, I limited the scope of my investigation by focusing on a specific desktop pool and then grouped by Stress Level by dragging the Stress Level column header to the group location. I expanded the group of machines that had a stress level of 3 (medium) and saw that the CPU utilization of one machine was at 100%. In this way, I could comb through hundreds of machines to find the one or more machines with an issue.
The root cause of the high CPU usage can be identified by clicking on the Process tab and sorting on the CPU column. By doing this, I could identify the process that was using the CPU cycles.
To see if this machine’s high CPU usage was impacting other machines on the host, I clicked the Machines tab, clicked the X next to the previous grouping, and then grouped by Host Name. Expanding the groups, I saw that the host had four machines and, even though one machine was at 100% CPU usage, the CPU ready for all the machines on this host was at 0%, so there was plenty of CPU available for the other machines.
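The grouping workflow above can be approximated in a few lines of code, which may help if you export the same columns for offline analysis. This is only a sketch of the idea; the field names and sample rows are invented for illustration and do not come from ControlUp:

```python
from collections import defaultdict


def group_by(rows, key):
    """Bucket rows by the value of one column, like dragging a column header to group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return dict(groups)


# Hypothetical sample of machines in one desktop pool.
machines = [
    {"name": "desk-1", "host": "host-a", "stress_level": 1, "cpu": 8, "cpu_ready": 0},
    {"name": "desk-2", "host": "host-a", "stress_level": 3, "cpu": 100, "cpu_ready": 0},
    {"name": "desk-3", "host": "host-a", "stress_level": 1, "cpu": 5, "cpu_ready": 0},
    {"name": "desk-4", "host": "host-b", "stress_level": 3, "cpu": 40, "cpu_ready": 0},
]

# Step 1: group by stress level and look for saturated CPUs in the stressed group.
by_stress = group_by(machines, "stress_level")
hot = [m["name"] for m in by_stress[3] if m["cpu"] >= 100]

# Step 2: regroup by host and check whether neighbors of the hot machine are starved.
by_host = group_by(machines, "host")
worst_cpu_ready = max(m["cpu_ready"] for m in by_host["host-a"])
```

Regrouping the same rows by a different column is what turns a single observation (one busy desktop) into a statement about its neighbors on the host.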
To further verify that the users on the other machines on this host were not experiencing latency, I selected UX Score from the Column Preset drop-down menu and confirmed that the User Input delay fields were not unusually high.
Had the process that was consuming 100% of the CPU cycles been causing issues for other users, I could have killed it from the ControlUp Console by right-clicking it and selecting the kill option, or used another mechanism to limit its CPU consumption.
Being able to diagnose and solve issues in real time is a major use case for ControlUp, but its power can be harnessed even when you are not around through the use of triggers and scripted actions. Using triggers, you can monitor your environment and then specify (using a script) what should happen. A prime example of this is monitoring a Horizon Connection Server: if an event happens, you can specify the action to take to correct the issue.
Scripts do not need to be set off by triggers and can be used to automate your activities. You can get a list of ControlUp and community-created scripts. One of the most innovative, popular, and helpful scripts is Analyze Logon Duration, which shows the events that affect a user’s logon time.
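Conceptually, a trigger pairs a condition on a monitored metric with an action to run when the condition is met. The sketch below illustrates that pattern only; it is not ControlUp's trigger API, and the function names, metric name, and threshold are hypothetical:

```python
def make_trigger(metric, threshold, action):
    """Return a function that runs `action` when `metric` reaches `threshold`."""
    def evaluate(sample):
        if sample.get(metric, 0) >= threshold:
            return action(sample)
        return None  # condition not met, nothing to do
    return evaluate


def describe_action(sample):
    # Hypothetical corrective action; it only reports what it would do.
    return f"would run corrective script on {sample['machine']}"


cpu_trigger = make_trigger("cpu_percent", 90, describe_action)
fired = cpu_trigger({"machine": "desk-2", "cpu_percent": 100})
quiet = cpu_trigger({"machine": "desk-1", "cpu_percent": 10})
```

Keeping the condition and the action separate is what lets the same script be run either automatically by a trigger or manually by an administrator.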
To use a script, select Script Actions and then select the script you want to run.
Running the script mentioned above, Analyze Logon Duration, shows how much time is spent in each phase when a user logs on.
VMware instant clones are a powerful technology that is changing the way that we use virtual desktops. However, to be used effectively, they need to be closely monitored, and if an issue does occur, you need a powerful tool, such as ControlUp, to troubleshoot it. By using powerful features in ControlUp, such as grouping for multidimensional analysis, you can quickly focus on issues. This not only detects issues on a single machine but can also show you whether an issue is affecting other machines.