Welcome to a new regular feature on the ControlUp blog. Every month, we’ll feature an interview with a member of the ControlUp community who submitted a script that was subsequently published in our Script Library. This month we’re featuring Tim Riegler, Systems Engineering Manager at Cherry Health. Many thanks to Tim for his script and his interview.
ControlUp: What problem were you trying to solve with the Reset WEM – Registry Settings script?
Tim: This problem affected our users who had moved to Citrix WEM (Workspace Environment Management). Our WEM user settings include network printer mapping, network drive mapping, external applications, file type associations, and registry settings. WEM stores all of those settings as a giant XML ‘blob’ in a single registry value within the user’s registry.
What we were finding is that this XML ‘blob’ would get cut off at the end, which would corrupt the user’s registry, so mappings, settings, etc., would stop applying correctly. When that happened, users would complain that they were missing printers or missing their network drives. All sorts of weird stuff was happening.
Initially, the helpdesk staff were resorting to profile resets for affected users. That was very disruptive for our users and time-consuming for our helpdesk staff.
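To illustrate the failure mode Tim describes, here is a minimal sketch of how a cut-off XML value could be detected by checking whether it still parses. This is not the script from the Script Library, which was written in PowerShell. The sample XML, its element names, and the `is_well_formed` helper are hypothetical, since the interview does not show the actual structure of the WEM blob.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as a complete XML document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        # Raised for truncated, empty, or otherwise malformed XML.
        return False


# Hypothetical sample data; the real WEM element names are not given in the interview.
complete = "<settings><printer name='p1'/><drive letter='H'/></settings>"
# Simulate a value that was cut off at the end.
truncated = complete[:-12]

print(is_well_formed(complete))
print(is_well_formed(truncated))
```

A check like this only flags the symptom. It does not explain why the value was truncated in the first place.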
We built this script so that we could delete all of that cache data out of the user’s registry while they were logged in, do a WEM refresh from within their session, and verify all the user’s settings were working normally.
The script allowed our helpdesk to troubleshoot and resolve the particular problems of missing drive mappings, printers, proxy settings, whatever the case may be, without having to perform a full profile reset for the user.
Editor’s Note: The script described above was written before ControlUp Automated Actions was released.
ControlUp: I understand that this script evolved over time.
Tim: Yes. We originally wrote a script that just fixed the missing printer problem. If we were having problems with a particular security group – let’s say that there’s a printer security group that has 500 printers or something crazy like that – and it keeps getting corrupted, you can just throw that piece of code in there, whack the old settings, and not have to redo the entire user settings piece of the registry entry (or the entire profile).
When I would troubleshoot printing issues, I’d go into the registry key, pull the XML out, and see that it was 700 lines in Notepad++! It’s ridiculous how much stuff you can shove in there, which makes it tough to find exactly where the problem is. I’ve heard that other people had similar problems, but it could have been other issues: a bad upgrade, a bug that we were never able to resolve, etc.
Eventually we decided it was a lot easier to do the whole thing over, so we rebuilt our WEM infrastructure from scratch. We recreated all of our settings in the new environment, but we were still having similar problems.
So I eventually wrote a new script that resets all the user’s registry settings.
ControlUp: When did you do the rewrite? How long did you put up with the problem?
Tim: I wrote the script that resets all the user’s registry settings in 2017, so we put up with this problem for at least a year. Maybe longer. Some of that time was trying to figure out what was going on and how to fix it. Once we got that figured out, writing the script was much more straightforward.
Editor’s Note: The re-written script was added to the ControlUp Script Library in May 2019.
ControlUp: You described earlier which end users were affected by the problem. Can you ballpark for us how many users were affected and how much time you spent fixing user issues?
Tim: It was five to ten users a day, but it was every day for over a year, which really starts adding up. Over that year, every user had issues at least once, and most users had issues every couple of months. Prior to the script, the “fix” was to reset the profile. That’s a lot of profile resets! And it was painful! You had to get on a call with the user to determine if their registry was corrupted. You had to tell users to log out and log back in twice to make the fix, and that doesn’t make any sense to them.
ControlUp: What do you think that cost you?
Tim: Endemic downtime and annoyance for us and them! In terms of time, about an hour of lost productivity a day for the helpdesk and an hour of lost productivity a day on the user side. And it slowed down the process of serving patients.
ControlUp: So that’s two hours of lost productivity a day in total: the helpdesk can’t do more important work, end users are frustrated and unproductive, and patient service slows down.
Tim: Yes, and the thing is that this script didn’t really solve the problem; it just helped us fix the symptoms faster! We weren’t spending as much time troubleshooting. We put a process in place and said, hey, if you see these symptoms, run this script to do a WEM reset, do these tasks, boom, get the user back on her feet as fast as possible. The goal is, how do we get users back to doing what they need to do?
ControlUp: What was the final straw for you? When did you say, “That’s it!”?
Tim: In July 2019, during our monthly meeting of the IT managers, this one manager was going through his tickets for the month. He showed us the number of profile resets for the month and the amount of time they were spending on fixing printer problems. I don’t remember the exact number, but it was a lot. The visualization of the amount of time we were spending on that one problem was what did it.
When you’re doing your day-to-day job, you just want to solve a problem quickly. It’s always easier to work around a problem initially, especially when you’re trying to do the day-to-day and don’t have the time to change architecture, right? Solving the underlying problem requires a lot more thought and a lot more time.
But I had attended Synergy in May 2019 and learned about ControlUp Automated Actions, so I said, “Hey, I think we can do something about this.” I could fix it without going through an architecture change and it would stay fixed.
ControlUp: How long did it take you to write that script?
Tim: The actual writing of the script was not long at all. Writing it and testing it to make sure it did what I wanted it to do took no more than an hour. Some of that time was also figuring out how to handle registry keys with PowerShell. That was probably the biggest part of that hour.
ControlUp: An hour? That’s it?
Tim: Yeah. It’s insane. What it really takes is time to sit back and look at… how can I make this process better? What part of it can I automate? You have to sit down and critically evaluate what you’re currently doing to understand whether or not it can even be automated, or what part can be automated. And you have to define the automation constraints. Once you figure that out, you can fairly simply and quickly automate fixes that work for a long, long time.
ControlUp: It sounds like writing and implementing the script was worth the investment.
Tim: Yeah, definitely. Yeah.
ControlUp: What was the most challenging part?
Tim: Technically, the hardest part was figuring out some of the registry pieces, because I hadn’t really worked with PowerShell on the registry, and I was just learning PowerShell. But like I said, the most challenging part was identifying what we could automate and then testing it.
ControlUp: Was there anything else you learned during the process?
Tim: Yes! Test, test, test, test, test! Create a test account that you can bang against before you roll to production. The second thing is to make sure you have your security policy set up so the IT staff who need to use the script can. And they will love you. And users will love you.