Improving DEX: Step Two – Troubleshooting


A recent study showed that only 20% of employees felt IT would support them when troubleshooting technology. Finding problems on your own computer is hard enough; finding them on computers hundreds of miles away is harder. Fortunately, ControlUp Edge DX gathers data from computers anywhere in the world, caches the data while a computer is offline, and submits it when the computer reconnects to the internet. This blog does not teach you how to troubleshoot (I assume you have your own methodology) but shows you a few ways to troubleshoot faster with Edge DX.

This blog is part of a four-part series about improving the digital employee experience.

Improving the Digital Employee Experience blog series:


In the first blog, Improving DEX Step One: Alerting, I discussed why alerts are critical to improving the Digital Employee Experience. In this blog, I discuss how to use Edge DX to troubleshoot.

 

Troubleshooting with Edge DX use cases

Edge DX can troubleshoot any issue that hinders the digital employee experience. I cannot cover every use case in this blog, so instead I give an overview of what is possible and walk through a few examples of troubleshooting a desktop problem. Below are some of the issues you can quickly troubleshoot with Edge DX.

  • Problems with networks
    • High latency
    • High network usage
    • Low bandwidth
    • TCP/IP and DNS issues
    • Wi-Fi signal
  • Problems with hardware
    • High CPU temperature
    • High CPU usage
    • Low memory
    • Low disk space
    • Failing battery
  • Problems with operating systems
    • BSOD and other stop errors
    • Excessive startup applications
    • Excessive logon duration
  • Problems with local applications
    • Low MOS score for Unified Comms
    • Excessive Unified Comms cache
    • Application crash
  • Problems with SaaS and Web applications
    • SaaS availability
    • SaaS performance
  • Geographical issues
    • Web service outage
    • Geo-latency issues

 

Troubleshooting with the UI

The Edge DX device details dashboard displays critical data about desktop performance to help troubleshoot many device problems. In Figure 1, we can see a historical problem with CPU and RAM that lasted until 8:20, but where do I start troubleshooting it?

Figure 1, Edge DX Device Details

Often, when we troubleshoot performance, we clear one bottleneck only to find another, so a value that looks like the problem may be just a symptom of a different root cause. In the table below, I list the default Edge DX performance charts in the device details section, identify when a value needs troubleshooting, and suggest where to start.

Variable Troubleshooting Areas
CPU Usage > 80%
High CPU usage can cause your computer to slow down, overheat, or crash. There are many possible reasons and solutions for high CPU usage, depending on your system configuration and the applications that you use. Here are some common areas to look at to troubleshoot high CPU usage.
  • Check open processes.
  • Scan your computer for malware.
  • Update your drivers and software.
  • Adjust your power settings.
  • Disable unnecessary startup programs.
CPU Queue Length > 2
CPU queue length is the number of threads waiting for processor time. It measures how busy the CPU is and how well it can handle the workload. A high CPU queue length can indicate a CPU bottleneck, which can cause performance issues and slowdowns. A low CPU queue length indicates the CPU has enough resources to run the processes smoothly. You can look in the following areas to troubleshoot high CPU queue length.
  • Close or end any processes that are causing high CPU queue length.
  • Update your drivers and software.
  • Adjust your power settings.
  • Upgrade your hardware.
Memory Usage
High memory usage is a problem that can affect your computer’s performance and stability. Memory, or RAM, is the temporary storage space where your computer keeps the data and programs that it is currently using. The more memory you have, the more data and programs your computer can load and run simultaneously. However, if your memory usage is too high, your computer lacks free space to load and run new programs or tasks. This can cause your computer to slow down, freeze, or crash. There are several possible causes and solutions for high memory usage, depending on your system configuration and the applications that you use. Here are some common steps that you can take to troubleshoot and fix high memory usage.
  • Close unnecessary programs.
  • Disable startup programs.
  • Increase virtual memory.
  • Scan your computer for malware.
  • Update your drivers and software.
Disk Queue Length >5
The “disk queue length” is a performance metric useful for troubleshooting and monitoring your computer’s disk drive (typically a hard drive or SSD). It represents the number of pending input/output (I/O) operations in the disk’s queue. When the disk queue length is high, it usually indicates that the disk is struggling to keep up with the volume of I/O requests, which can lead to performance issues. Here’s how you can troubleshoot and address high disk queue length.
  • Check for malware or viruses.
  • Update drivers.
  • Clean up the hard disk.
  • Check for fragmentation.
  • Monitor for background processes.
  • Upgrade to an SSD.
  • Check for hardware issues.
  • Adjust the paging file (Virtual Memory).
  • Optimize your software.
  • Check available RAM.
Network Usage
Troubleshooting high network usage on a computer can be challenging, but you can often identify and resolve the issue with some systematic steps. Here are some areas to help you troubleshoot excessive network usage.
  • Check for background processes.
  • Scan for malware.
  • Check for cloud sync services.
  • Adjust stream quality settings.
  • Disable peer-to-peer (P2P) file sharing.
  • Check for browser extensions.
  • Check router configuration.
  • Check QoS settings.
  • Consider changing your ISP or upgrading the network.
Network Latency >200
Network latency is the delay when data travels from one point to another on a network. High network latency can cause slow performance, poor user experience, and lost productivity. There are many possible causes and solutions for network latency, depending on the type and configuration of your network. Here are some general steps you can take to troubleshoot network latency.
  • Check your network speed and bandwidth.
  • Check your network devices and cables.
  • Check your network configuration and settings.
  • Check your network traffic and usage.
  • Check your network location and distance.
Wi-Fi Signal >65
If you’re experiencing low Wi-Fi signal strength on your computer, there are several steps you can take to troubleshoot and improve the signal. Here are some areas to help troubleshoot low Wi-Fi signals.
  • Check signal strength.
  • Check router placement.
  • Reduce interference.
  • Change the Wi-Fi channel.
  • Update router firmware.
  • Check network security settings.
  • Try a Wi-Fi extender.
  • Upgrade router.
  • Adjust router antennas.
  • Reduce Wi-Fi noise.
  • Optimize Wi-Fi settings.
  • Check for firmware updates on the computer.
  • Try a wired connection.
  • Consider a mesh Wi-Fi system.
Open Handles

Open handles are references to files, folders, registry keys, or other resources used by a process or program on your computer. If there are too many open handles, it can cause performance issues, errors, or crashes. Here are some steps to take to troubleshoot Open Handles.

  • Identify which processes or programs have the most open handles.
  • Close or end any processes or programs with many open handles that are not essential for your current task.
  • You may need to restart your computer if you cannot close or end a process or program with many open handles.
  • If the problem persists after restarting your computer, you may need to update or uninstall any drivers, software, or hardware causing the open handles issue.
Context Switches

Troubleshooting “context switch” issues in a computer typically involves investigating performance problems and system resource allocation. Context switching occurs when the CPU switches from one process or thread to another, and excessive context switching can lead to degraded system performance. Here are some steps to troubleshoot context switch-related problems.

  • Check for malware or rogue processes.
  • Reduce background processes.
  • Update and optimize software.
  • Adjust thread priorities.
  • Check for hardware issues.
  • Review system logs.
  • Review CPU affinity.
  • Upgrade or replace hardware.
  • Consider virtualization and containerization.
Peak User Input Delay >70

High user input delay, or input lag, can be frustrating and impact the user experience on a computer. It’s the delay between inputting a command (e.g., clicking a mouse or pressing a key) and the computer responding. Here’s how to troubleshoot and reduce high user input delay.

  • Check for software updates.
  • Monitor resource usage.
  • Close unnecessary background apps.
  • Check for malware.
  • Optimize startup programs.
  • Reduce visual effects.
  • Adjust mouse and keyboard settings.
  • Check for input device issues.
  • Update graphics drivers.
  • Check for display refresh rate.
  • Adjust graphics card settings.
  • Free up disk space.
  • Defragment hard drive.
  • Upgrade hardware.
  • Check for network issues.
  • Test on a different user profile.
  • Perform a clean install.
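
The numeric thresholds in the table can also be checked in a script. The following is a minimal sketch, not an Edge DX feature: the metric names (`CpuUsagePercent` and so on) are hypothetical labels I chose for illustration rather than Edge DX field names, and the sample values are made up. Wi-Fi signal is left out because a low signal, rather than a high one, is the concern there.

```powershell
# Thresholds copied from the table above. The key names are hypothetical
# labels for this sketch, not Edge DX field names.
$Thresholds = [ordered]@{
    CpuUsagePercent    = 80
    CpuQueueLength     = 2
    DiskQueueLength    = 5
    NetworkLatency     = 200
    PeakUserInputDelay = 70
}

function Get-MetricOverThreshold {
    param(
        # Hashtable of metric name -> measured value
        [Parameter(Mandatory)] [hashtable] $Sample,
        # Metric name -> limit; defaults to the table values above
        [System.Collections.IDictionary] $Limits = $Thresholds
    )
    foreach ($name in $Limits.Keys) {
        # Report only metrics that are present in the sample and above the limit
        if ($Sample.ContainsKey($name) -and $Sample[$name] -gt $Limits[$name]) {
            [pscustomobject]@{
                Metric    = $name
                Value     = $Sample[$name]
                Threshold = $Limits[$name]
            }
        }
    }
}

# Made-up sample values, for illustration only
$sample = @{ CpuUsagePercent = 93; CpuQueueLength = 1; DiskQueueLength = 7 }
Get-MetricOverThreshold -Sample $sample | Format-Table -AutoSize
```

With this sample, the function reports CPU usage and disk queue length, which tells you which rows of the table to start with.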

 

Troubleshooting with scripts

There are multiple ways Edge DX can use scripts to help find problems with computers.

  • Run a troubleshooting script from a remote computer’s shell.
  • Run a troubleshooting script and have the results displayed in the device events.
  • Run a troubleshooting script and have the results displayed in a custom Edge DX database.

To add and create scripts in Edge DX, click the configuration icon in the upper right corner of the screen and then click Scripts.

From the Scripts pane, you can delete scripts, add scripts from the ControlUp library, and create new ones. Edge DX supports all major scripting languages, such as PowerShell, VBScript, JScript, Command Script, Python, Python 3, Bash, Swift, and Shell Script. When creating a script, you need to identify the platform, the scripting language, whether an alert can trigger the script, any grouping permissions, the timeout, and data collection, which we will discuss later.

Create a script to run on a remote computer.

Edge DX can run a remote shell in a system context on Windows, macOS, and Linux computers by selecting Remote Shell from the assist drop-down menu on a device’s detail dashboard. For more information on the remote shell, see Remote control options for Edge DX.

Let’s say you have a user who complains that the computer takes an extraordinarily long time to start up. You have determined that the problem could be the number of applications that start automatically. I’m not very good at scripting; however, I can use ChatGPT to create a PowerShell script that lists the startup applications for a Windows computer, as seen below.

# Get a list of startup applications using WMI
$StartupApps = Get-CimInstance -Namespace "Root\CIMv2" -Class Win32_StartupCommand
# List the startup applications
Write-Host "Startup Applications:"
foreach ($app in $StartupApps) {
    Write-Host "Name: $($app.Name)"
    Write-Host "Command: $($app.Command)"
    Write-Host "Location: $($app.Location)"
    Write-Host "User: $($app.User)"
    Write-Host ""
    Write-Host "------------------"
}

Figure 2 shows the script running in a remote shell against a Windows PC. The script’s output shows three programs that run at startup: OneDrive, MicrosoftEdgeAutoLaunch, and SecurityHealth. For each one, it also shows the user context, which is either a username or Public for all users of the computer.

Figure 2, Edge DX Remote Shell with PowerShell Script

Running a troubleshooting script in the background of a remote computer is a great way to find the root cause of a problem without affecting the employee’s productivity.
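
When the list is long, a count per user context makes "excessive" easier to judge. The sketch below is my own illustration, not part of the Edge DX script library: `Get-StartupAppSummary` is a hypothetical helper name, and the demo objects only mimic the `Name` and `User` properties that `Win32_StartupCommand` returns.

```powershell
function Get-StartupAppSummary {
    param(
        # Objects with Name and User properties, such as Win32_StartupCommand instances
        [Parameter(Mandatory, ValueFromPipeline)] $StartupApp
    )
    begin   { $apps = [System.Collections.Generic.List[object]]::new() }
    process { $apps.Add($StartupApp) }
    end {
        # Group by user context and list the largest groups first
        $apps | Group-Object -Property User |
            Sort-Object -Property Count -Descending |
            ForEach-Object {
                [pscustomobject]@{
                    User  = $_.Name
                    Count = $_.Count
                    Apps  = ($_.Group.Name -join ', ')
                }
            }
    }
}

# Demo data shaped like the output in Figure 2; the user name is made up.
$demo = @(
    [pscustomobject]@{ Name = 'OneDrive';                User = 'DEMO\alice' }
    [pscustomobject]@{ Name = 'MicrosoftEdgeAutoLaunch'; User = 'DEMO\alice' }
    [pscustomobject]@{ Name = 'SecurityHealth';          User = 'Public' }
)
$demo | Get-StartupAppSummary

# On a Windows device, the same helper could be fed from WMI:
# Get-CimInstance -ClassName Win32_StartupCommand | Get-StartupAppSummary
```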

For more information on excessive logon time, see Tom Fenton’s blog: Got Slow Logons? Fix Them Fast With ControlUp For Physical Endpoints & Apps!

Create a script to output results to Edge DX events.

Edge DX can run a predefined script on one or many computers and output the results to the Device Events tab of the computer details. To manage scripts, click the configuration icon in the upper right corner of Edge DX, then click Scripts. From there, you can add scripts from ControlUp’s script library or add and remove scripts, as seen in Figure 3.

Figure 3, Edge DX Scripts

In this example, we took the same script that we ran in the remote shell but changed the output so that it is displayed in the Device Events tab, as shown in Figure 4.

# Set output encoding to ensure non-ASCII characters are captured
[Console]::OutputEncoding = [System.Text.Encoding]::UTF8

try {
    # Get a list of startup applications using WMI
    # -ErrorAction Stop makes a failed query reach the catch block
    $StartupApps = Get-CimInstance -Namespace "Root\CIMv2" -Class Win32_StartupCommand -ErrorAction Stop

    # List the startup applications between the event markers
    Write-Output("### SIP EVENT BEGINS ###")
    Write-Host "Startup Applications:"
    foreach ($app in $StartupApps) {
        Write-Host "Name: $($app.Name)"
        Write-Host "Command: $($app.Command)"
        Write-Host "Location: $($app.Location)"
        Write-Host "User: $($app.User)"
        Write-Host ""
        Write-Host "------------------"
    }
    Write-Output("### SIP EVENT ENDS ###")
    Exit 0
}
catch {
    # Report the failure as an event and signal an error exit code
    Write-Output("### SIP EVENT BEGINS ###")
    Write-Output -InputObject "There was a problem retrieving the startup apps."
    Write-Output("### SIP EVENT ENDS ###")
    Exit 1
}

Once the above script is saved in the script library, it can be run simultaneously on one or many devices. In Figure 4, you can see the output of the above script in the Device Events for the computer called JJ-NUC-B-11. In the DESCRIPTION column, we can see the startup applications.

Figure 4, Edge DX Device Events

Saving troubleshooting scripts to the Edge DX script library is a great way to share them with other IT support technicians.
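
Because every script that reports to Device Events repeats the same two marker lines, a small wrapper keeps shared scripts consistent. This is a sketch under my own naming: `Write-SipEvent` is a hypothetical helper, not an Edge DX cmdlet, and it only emits the marker strings used in the script above around whatever text you pass in.

```powershell
function Write-SipEvent {
    param(
        # One or more lines of text to report as a single event
        [Parameter(Mandatory)] [string[]] $Message
    )
    Write-Output '### SIP EVENT BEGINS ###'
    Write-Output $Message
    Write-Output '### SIP EVENT ENDS ###'
}

# Illustration with literal text; a real script would pass its own results.
Write-SipEvent -Message 'Name: OneDrive', 'Name: SecurityHealth'
```

A script can then call the helper once in its success path and once in its `catch` block instead of repeating the markers by hand.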

Create a script to output results to an Edge DX database.

When troubleshooting recurring problems, the output can be written to the Edge DX database. When writing data to a custom index, you must select the Sends Data checkbox and then specify the name of the Data Index where the data will be written. In Figure 5, you can see the script configuration area of Edge DX, where the Platform is Microsoft Windows, the Language is PowerShell, the Trigger is set to Custom-Action – System, the Timeout is set to 60 seconds, Sends Data is selected, and the Data Index is set to startup_apps.

Figure 5, Edge DX Script Options

The code below will gather all the startup applications and prepare the output for an entry into the Edge DX Data Index startup_apps as defined in the script’s configuration.

try {
    # Get a list of startup applications using WMI
    # -ErrorAction Stop makes a failed query reach the catch block
    $StartupApps = Get-CimInstance -Namespace "Root\CIMv2" -Class Win32_StartupCommand -ErrorAction Stop

    # Prepare data for Edge DX
    $DataOutput = @()
    foreach ($app in $StartupApps) {
        $DataOutput += @{
            Name     = $app.Name
            Command  = $app.Command
            Location = $app.Location
            User     = $app.User
        }
    }

    # Write data
    Write-Output "### SIP DATA BEGINS ###"
    Write-Output ($DataOutput | ConvertTo-Json -Depth 10)
    Write-Output "### SIP DATA ENDS ###"

    # Write event (optional)
    Write-Output "### SIP EVENT BEGINS ###"
    Write-Output "Startup applications list retrieved successfully."
    Write-Output "### SIP EVENT ENDS ###"
    Exit 0
} catch {
    # Handle exceptions and write error event
    Write-Output "### SIP EVENT BEGINS ###"
    Write-Output "There was an error retrieving the startup applications."
    Write-Output $_.Exception.Message
    Write-Output "### SIP EVENT ENDS ###"
    Exit 1
}

As in the previous example, we ran the above code on a computer, but this time the results are stored in an index called startup_apps, as seen below in Figure 6.

Figure 6, Edge DX Custom Data
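
One PowerShell detail is worth knowing when preparing JSON for a data index: piping an array to `ConvertTo-Json` unrolls it, so a device with a single startup entry produces a bare JSON object rather than a one-element array. Whether Edge DX accepts both shapes is not something this blog covers, so treat the following as a general PowerShell sketch with made-up sample data rather than an Edge DX requirement.

```powershell
# A one-element array, as you would get from a device with a single startup app
$one = @([ordered]@{ Name = 'OneDrive'; User = 'Public' })

# Piping unrolls the array, so the result is a bare JSON object
$piped = $one | ConvertTo-Json -Compress

# Passing the array as an argument keeps the enclosing JSON array
$asArray = ConvertTo-Json -InputObject $one -Compress

$piped
$asArray
```

If the rows in your index should always have the same shape, passing the collection with `-InputObject` keeps single-entry and multi-entry devices consistent.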

IT professionals must find the root cause of desktop computer problems to keep people productive. Edge DX helps IT identify the root cause with standard procedures, an intuitive user interface, and the ability to use scripts to get more information.

Keep reading the series to see how ControlUp Edge DX is used to remediate issues, including with automated remediation.


Jeff Johnson

Jeff is a product marketing manager for ControlUp. He is responsible for evangelizing the Digital Employee Experience on physical endpoints such as Windows, macOS, and Linux. Jeff has spent his career specializing in enterprise strategies for client computing, application delivery, virtualization, and systems management. Jeff was one of the key architects of the Consumerization of IT Strategy for Microsoft, which has redefined how enterprises allow unmanaged devices to access corporate intellectual property.