Out of the box, ControlUp offers some amazing technology for reporting on logon durations for users. Overall Logon Duration, Profile Load Time, Group Policy Load Time, Desktop Load Time and a column for unknown logon duration contributors are all calculated and displayed—in real time—right after a user has logged on.
This is extraordinarily helpful for identifying low-hanging fruit optimization opportunities.
Originally introduced in 2015, ControlUp (along with help from our community members—particularly Guy Leech) has been continually improving and increasing the functionality of the Analyze Logon Duration script, which can break logons down into granular phases, based on technology or logon processes.
But ALD can only report back on the things it understands.
See those two large gaps of time on the far right of the image? 33.9 seconds and 20.9 seconds?
When ControlUp customers have unexplained logon duration gaps, they can reach out to us to help them assist in troubleshooting the root cause of these gaps. From here, ControlUp Professional Services can assist in identifying the technology or reason for the gaps. This article will go into the technical details of the gap in this screenshot, what it is we discovered, and how ALD reports it after we added this technology into the script.
I discovered this while examining a customer’s logon problem. They had large gaps of time reported by Analyze Logon Duration.
To analyze sessions in your environment, right-click on a session, select “Script Actions”, “Analyze Logon Duration” and then “OK.”
Or use the ControlUp Virtual Expert™️. Just click on the “menu” icon next to “Logon Duration” and select “Analyze Logon Duration.”
Easy right? The results appeared like this:
These gaps mean that ALD is unfamiliar with whatever is happening, but is able to determine that the process took some amount of time.
I spent some time with our customer and started to examine the logs. At first, I found WMI calls occurring during the logon, but before the users group policy phase! Seriously bizarre.
I created a custom view in Event Viewer and looked more closely at the group policy log combined with the Application event log and I found a predictable pattern.
Event 4257 is “starting to download policies” and event “5257” is “Successfully completed downloading policies.” Event 8 is the start of Citrix RSoP processing, and Event 9 is “completed Citrix RSOP processing.”
With this information, I could add the start / stop times of the Citrix RSoP to the rest of the logon process output for the Analyze Logon Duration script. Re-running ALD produced output that could identify the gaps and properly assign the source phase.
I didn’t stop there though. We added the ability to track how long WMI queries took to run!
The “WMI Filters executed during logons” shows the WMI filters and the GPOs applied with them. In the Group Policy logs, when Microsoft does the processing of group policies, it follows an order of predictable events, including the GPO object it’s currently operating on. With this information, I was able to associate a specific WMI query, its GPO and when it made that query, all at the same time. This is why the bottom of this display can show the GPO name.
However, Citrix only appears to use a portion of the MS API or method of getting GPOs as it’s not generating events like Microsoft does. The portion of the MS API it is calling does generate the “starting to download policies” and “Successfully completed downloading policies” events. Since the Citrix method of examining the GPOs differs from the Microsoft method, it’s not possible to find out which GPO the CitrixCSEEngine examines while it runs its RSoP process.
OK, OK, OK. Enough of this “how it works” stuff. How does this impact YOU?!
Well… evaluating multiple GPOs with WMI filters a few times can be a massive drag on logon performance. Especially if the number of GPOs is significant. In my example, I have a few WMI filters:
SELECT * FROM Win32Reg_AddRemovePrograms WHERE (DisplayName like “%VMware%”)
SELECT * FROM Win32_Product WHERE (name like “%Citrix%”)
SELECT * FROM Win32_Processer WHERE AddressWidth = “64”
The first filter is invalid. It’s calling a namespace that doesn’t exist on the target machine because the WMI filter hasn’t been updated since it was originally set up in 2003.
The second filter is slooooow. Microsoft PFEs strongly discourage the use of Win32_Product because of its many flaws.
The last filter is one I found in the many “use these WMI filters” while searching Google. However, using ALD I discovered it incurs a massive performance penalty. It takes about one second per CPU Socket for this query. I was puzzled by why some of my systems showed around four seconds, some two, and some eight for the same query! When investigating, it appeared that this query took around one second per CPU socket. Interesting! And really, I only could discover this by using ControlUp ALD.
Finally, what about RSoP Logging? Microsoft’s RSoP logging is enabled by default. My experience has shown that this can lead to poor logon durations as RSoP logging stores its data in the WMI repository, which can cause bloat and the processing of WMI can cause slowness. I recommend disabling RSoP logging if you can and only enable it on demand when you need to troubleshoot.
Does, you might ask, the Citrix CSE Engine read the RSoP logging value and skip GPO evaluation if the RSoP logging is set to “disabled”?
I’m sorry to say that Citrix does not respect this value. It will still do RSoP evaluation.
Doing a procmon trace on the Citrix CSE Engine while it’s starting does show some interesting registry values being read:
My hope was that RSoP was triggered by the “SaveRsopToXXX” registry entries and that by setting them to disabled, it would prevent this behavior. But the only articles I could find referencing them were Known issues in Cumulative Update 8 (Citrix XenApp & XenDesktop)and Citrix Policies – Missing or inconsistent Citrix Policies in Citrix Director, both on the Citrix Knowledge Center. Neither had the values, but the information imply that 0x1 is ENABLED and 0x0 is DISABLED.
I set all three registry values to disabled and, when I relaunched my session, it still showed the Citrix RSoP stage. Perhaps one of the other registry values would disable the RSOP stages, but, sadly, it appears to persist.
And with that, the Citrix RSoP stage will stay. Even a one-second WMI filter will get processed at least three times, leading to a delay of around three seconds for each logon with the Citrix CSE Engine. And with each WMI filter query duration multiplied by three, it’s important to keep track of them.
To get insights like these in your environment, start a trial of ControlUp today (it’s FREE; there’s no sales call required!) Just connect to one of your machines and start analyzing your recent logons. If you need further assistance getting the proper data, reach out to firstname.lastname@example.org and one of our pre-sales engineers can help get your environment set up.