ControlUp Automation – Faster Than a Human – Automatic Remediation

In my previous blog post, I showed how ControlUp Automation can augment our troubleshooting by gathering additional information. When it detected that the database had failed, ControlUp Automation automatically started several tasks: a CDF trace, a packet capture, and database and connectivity tests.

Once you have collected enough traces for Citrix to analyze, you are left waiting. While we wait for the analysis and a fix to come back, we need a way to remediate these outages.

Automatic Remediation

Using the Citrix site database going down as our primary issue, we know Citrix Virtual Apps and Desktops (née XenApp/XenDesktop) can work around this issue by leveraging the Local Host Cache (LHC). Citrix has a 90-second delay between detecting the database as failed and failing over to LHC. Since a database outage can occur during peak logon times, we need to reduce that delay. In this particular case, forcing the Local Host Cache on at the first sign of an outage can bring the outage down from 90 seconds to just a few seconds.

With ControlUp Automation we can enable LHC immediately when the database-down event is detected, and services can subsequently be restored once the database is detected as restored.

Script Action

I’ve created a script action to enable and disable the Local Host Cache. The script accepts one switch parameter that forces LHC into operation; when the switch is absent, the Citrix Broker makes its own determination of whether LHC should be used. For the purposes of automation, I enable the Local Host Cache. After it’s enabled by automation, manual intervention by an admin is required to switch back to the Citrix Broker deterministic mode.

This is to prevent flapping.

Flapping is when something in the environment repeatedly goes up and down. If the switch back to the Citrix Broker deterministic mode were automated, flapping would cause LHC outage mode to alternate between enabled and disabled. That could be a far worse state than leaving LHC enabled until the problem has passed.
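
To make the flapping argument concrete, here is a minimal sketch in PowerShell. It does not touch Citrix or the registry; it only counts how many times LHC mode would change for a made-up series of database health samples, with and without an automatic switch back. The function name `Get-LhcModeChanges`, its parameters and the sample data are all hypothetical and exist only for this illustration.

```powershell
# Hypothetical health samples: $true = database reachable, $false = database down.
$samples = $true, $false, $true, $false, $true, $false, $true

function Get-LhcModeChanges {
    param(
        [bool[]]$DatabaseUp,
        [switch]$Latch   # when set, LHC stays forced once it has been enabled
    )
    $forced  = $false
    $changes = 0
    foreach ($up in $DatabaseUp) {
        if (-not $up -and -not $forced) {
            # Database down and LHC not yet forced: force it on
            $forced = $true
            $changes++
        } elseif ($up -and $forced -and -not $Latch) {
            # Database back and no latch: automatically switch back
            $forced = $false
            $changes++
        }
    }
    $changes
}

$autoRevert = Get-LhcModeChanges -DatabaseUp $samples
$latched    = Get-LhcModeChanges -DatabaseUp $samples -Latch
"Auto-revert: $autoRevert mode changes; latched: $latched mode change"
```

With this sample series the automatic switch back changes mode six times, while the latched approach changes mode once and then waits for an administrator.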

The process with this automation:

  1. Database outage occurs
  2. Local Host Cache is enabled
  3. Local Host Cache will remain enabled even if Database connectivity is restored.
  4. The administrator will have to decide if the problem has passed and manually set Citrix Broker as Primary, stopping use of the Local Host Cache.
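
For step 4, an administrator may want to confirm whether LHC is still being forced before deciding to switch back. The following is a read-only sketch, assuming it is run on the Delivery Controller and that the forced state is recorded in the `OutageModeForced` registry value used by the script action. The function name `Test-LhcOutageModeForced` and its `-Path` parameter are hypothetical helpers, not part of ControlUp or Citrix.

```powershell
#requires -version 3
# Read-only sketch: reports whether LHC outage mode is currently forced.
# Assumes it runs on the Delivery Controller itself.
function Test-LhcOutageModeForced {
    param(
        # Hypothetical parameter so the check can be pointed at a test key
        [string]$Path = 'HKLM:\SOFTWARE\Citrix\DesktopServer\LHC'
    )
    $item = Get-ItemProperty -Path $Path -Name OutageModeForced -ErrorAction SilentlyContinue
    # A missing key or value leaves $item empty, which is treated as "not forced"
    [bool]($item -and $item.OutageModeForced -eq 1)
}

if (Test-LhcOutageModeForced) {
    Write-Output 'LHC outage mode is forced (OutageModeForced = 1).'
} else {
    Write-Output 'LHC outage mode is not forced; the broker decides.'
}
```

Because the check only reads the registry, it can be run at any time without changing how the broker behaves.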

The script action that enables and disables the Local Host Cache is available here:

[cc lang=”cpp”]

 
#requires -version 3

<#
    .SYNOPSIS
    Enables the Local Host Cache on a Citrix Delivery Controller

    .DESCRIPTION
    This script takes one parameter to enable the local host cache "OutageModeForced" registry value.  The absence of this parameter will cause the registry value
    to be removed.  Local Host Cache is force-enabled when the OutageModeForced registry value is present with a value of 0x1.  When it's removed the internal logic
    of the broker determines the state.  If the database is down and you remove the value, the broker will test the connectivity and, if it determines that the
    database is down, it will remain in that state.  However, if it determines that the database is up, it will switch to an "Up" state.

    .PARAMETER ForceLHCMode
    A switch that forces the Local Host Cache to be turned on.

    .EXAMPLE
    . .\ForceLHC.ps1 -ForceLHCMode
    Forces Local Host Cache to activate.

    .EXAMPLE
    . .\ForceLHC.ps1
    Removes the forceful enablement of LHC
#>
param (
    [switch]$ForceLHCMode
)

$ErrorActionPreference = 'Stop'

# Registry key that holds the broker's Local Host Cache settings
$lhcKey = 'HKLM:\SOFTWARE\Citrix\DesktopServer\LHC'

if ($ForceLHCMode) {
    # Create the key if it does not exist yet, then force outage mode on
    if (-not (Test-Path $lhcKey)) {
        New-Item -Path $lhcKey -Force | Out-Null
    }
    New-ItemProperty -Path $lhcKey -Name OutageModeForced -Value 1 -PropertyType DWord -Force | Out-Null
    Write-Host "Local Host Cache Mode Forced"
} else {
    if (Test-Path $lhcKey) {
        # Removing the value hands the decision back to the broker;
        # ignore the error raised when the value is already absent
        Remove-ItemProperty -Path $lhcKey -Name OutageModeForced -Force -ErrorAction SilentlyContinue
        Write-Host "Deterministic Local Host Cache Mode Configured"
    }
}

Or use the complete ControlUp Script Action below. Simply save it as an XML file and import it into the Script Management window in ControlUp.

<?xml version="1.0" encoding="utf-8"?>
<ArrayOfSBADescriptor xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <SBADescriptor>
        <SBAId>a5921eac-09c5-40fe-8cbd-a8e4649969e6</SBAId>
        <RootSBAId>866e7996-1ccb-48d1-bb11-e56b5b670801</RootSBAId>
        <SQN>0</SQN>
        <Name>CA - Configure LHC Mode</Name>
        <Description>Configures the Local Host Cache to be forcibly enabled or not.</Description>
        <Version>0.1.3</Version>
        <DateCreated>2019-07-10T15:54:24.9444858</DateCreated>
        <DateModified>2019-07-10T16:08:54.0987266</DateModified>
        <LastEditorName>trententtye00</LastEditorName>
        <CreatorName>trententtye00</CreatorName>
        <Status>2</Status>
        <CheckSum>gOEhy8IhbooSHkdcdBvOD+EJvDY=</CheckSum>
        <DenyCommunityShare>false</DenyCommunityShare>
        <Icon>
            <IconData>iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAYAAAAf8/9hAAAABGdBTUEAALGPC/xhBQAAAAlwSFlzAAAOwQAADsEBuJFr7QAAABh0RVh0U29mdHdhcmUAcGFpbnQubmV0IDQuMS42/U4J6AAAAOZJREFUOE/dkUsOgjAURQsT0EpU/FLYhThAxvy6KBPXYoxxJzh1ASbO3IBTnq8NFDUa/My8zUn7Xm9v2pQ8yEYWbyB8SjbtDjfWwNv3R+4Za2jCGrBDuzta49omhkkv1UYQBJAkSUkKWVojeo7jSJ9pmnI2WvJsnSwMnPOXhGEIlFLlL7krvoGAyxik5TVvedar8Dy3DuA8g09UFIV80vcBOH4KEPq3AOYyyPO83GqW8DL8ehUgmPk+xHH8kiiKJGLtz+fVYSCapq9ob7zr2NOjajaA3lO7N9lqur7EWmmMZG8ivChCropnCGRnTkTxAAAAAElFTkSuQmCC</IconData>
            <IconType>2</IconType>
        </Icon>
        <ExecutionDescriptor>
            <AllowedEntitiesEx>256</AllowedEntitiesEx>
            <AtomicEntityEx>256</AtomicEntityEx>
            <AllowedEntities>0</AllowedEntities>
            <AtomicEntity>0</AtomicEntity>
            <ScriptData>H4sIAAAAAAAEANVW207bQBCd50r9BytFAiSCivpWtZVQEwQqlyhJS6vSBxM7xMJJaGwHopZ/75mZtXdtnJAiVKmyNvF6Z2fOmduuRzfk0wxjTB5t0Ut6gX99vlNCtxRRSgMa0Q/aoAOaQnZAIR3TIX2kE8wDzHTXtvzzeAWpkH5Sht38lmC1SXO8zfAeYdcEX96I7Aa18XUmmvehOzXrHdk5lD0h5mzVo/cYm9SD1BTIN0XDO9hzce9i/Rud0hl09OgIw11tQ5dPlxQbXCm4seZjaBxgJcb7Id4TrHjg6At7llDUvnxlr8zwe4dZC6sx3pXfQtZZNhVWsViaFb6xKFvA0oNsFxg71MfvGVC7Mn1YjgRlAhRs70ZQpUBxbfCzJUZn4xhinS164iVPvJcztnxjh++o4Dtw+DaAJxNbV5jnsbY5EEDCk+hcCcrUsJ+L1gzfdwsWoXjuElI2llNEV/EoxzoGnH+xYGRkmey3HFbZdv2ofrgs9owxnwsDRfh47BXhsODeLHk1EKSjErbHfLcavVf4ROsnlHzyTD2OTB668urN18jIPcPqXJBMRFeKWkkesLdoI9HPPp8YP3B+MLpByZM2ZqF4lDP82sQqKOI2Fn3l+krEG2mRFUclTYGssT99E2PlH0D/ram7iSBe4EtWYWL1MJYF7SzFaPMpFXypIzmQWmLc2oXm4reFYzsSzJHsWs7WL2l1vfc4y6iIsqJUlr5Y8MyvtVH16aHoCU0n2nkiYqs1x5chD3dq0LnnQ15lvmBs0GfsaVQwPuyBHfT8LsYJemEfo4uvq08ZffZrrOc8bJX+XX93u0QKzjPxUUDaY+vQt+mrYO9Ae7uyxuPiAZddeCVBhfJ5uA7PgxKXdTn4RQYv9/zTsbuyXacKXW/bGAzhScbr9stx0c+0n6h+vTl8KLDy0PzdWnH32Mb6rxKmfE8TNqawsoXM0lpvgq1v+id78BN0ndBbcO2hXx9A7lzysY0v7hl/IWd8gi6i946eVJFW2YXBs12DJH/G2BnIXei5rbrW7kuzU8jdgvGRZMBY7lOshXdrV3O98byoWPepOcnXOQmb9MU5yfYMtjLePsaNrLeAjHcHlSry6Hdhje1n0qVcn5wLh1TOb1s/jTXqKsftVe4/rP1esjs2Hb0+F/9NBi7LP7dK/5eMsPM6Rssi2aqccXq74lvMOt3TRllv8EPsvJKTwEa7XG33Raf6A8oBur5KDQAA</ScriptData>
            <IsCompressed>true</IsCompressed>
            <Interactive>false</Interactive>
            <AltCredentialsType>0</AltCredentialsType>
            <Timeout>60</Timeout>
            <ScriptType>3</ScriptType>
            <TimeoutPolicy>1</TimeoutPolicy>
            <ConcurrencyMode>2</ConcurrencyMode>
            <ExecutionContext>0</ExecutionContext>
            <Arguments>
                <inner>
                    <sxSerializedDictionaryOfStringSBAArgumentDescriptor>
                        <item>
                            <key>
                                <string>d7a6df35-8c32-44c8-811e-994fd056ef00</string>
                            </key>
                            <value>
                                <SBAArgumentDescriptor>
                                    <ArgumentId>d7a6df35-8c32-44c8-811e-994fd056ef00</ArgumentId>
                                    <ArgumentType>1</ArgumentType>
                                    <Name>Force LHC Mode?</Name>
                                    <DefaultValue>-ForceLHCMode</DefaultValue>
                                    <ArgumentNumber>0</ArgumentNumber>
                                    <ValidationString />
                                    <ValidationError />
                                    <MaskTypedCharacters>false</MaskTypedCharacters>
                                    <EntityType>0</EntityType>
                                </SBAArgumentDescriptor>
                            </value>
                        </item>
                    </sxSerializedDictionaryOfStringSBAArgumentDescriptor>
                </inner>
            </Arguments>
            <InstructionsExecutionOnComputer />
            <SupressOutputDialog>false</SupressOutputDialog>
            <DefaultSharedCredGuid />
        </ExecutionDescriptor>
        <ImportValidationString>jNIYWaS48QYo+AwVd6fQaiCcdpE=</ImportValidationString>
    </SBADescriptor>
</ArrayOfSBADescriptor>

 

Trigger

Due to the criticality of the database outage and the required intervention by an administrator, the trigger is going to have another action assigned. I will use the new ControlUp Email Templates and configure one for this Script Result. A ControlUp Email Template lets us tailor the alert. Since this issue could cause a Major Incident, this email alert will use emphasis and color coding to make sure the message is noticed. More information on email templates can be found here.

I’ll configure the template like so:

The text for the email template:

<h1><span style="color: #ff0000;"><strong>WARNING!</strong>&nbsp;</span></h1>
<p>Citrix Database was detected as <span style="text-decoration: underline;"><strong>DOWN!</strong></span><br /></p>
Local Host Cache has been  <span style="text-decoration: underline;"><strong>ENABLED</strong></span> and will remain enabled until you run <b>"$(ScriptName)"</b> against <b>"$(CompName)"</b>, disabling Force Outage Mode.

Fault detected at: ($(Timezone)): $(ScriptStartTimestamp)

[/cc]

 

The result of the email template:

I set up the trigger as follows:

 

Now let’s watch automatic remediation in action.

Video of AA in action

 

 

 

About the author

Trentent Tye

Trentent Tye, a Tech Person of Interest, is based out of Canada and its many, many feet of snow. FUN FACT: Trentent came to ControlUp because, as a former customer, the product impacted his life in so many positive ways—from reducing stress, time to remediation, increased job satisfaction, and more—he had to be our evangelist. Now an integral part of ControlUp’s Product Marketing Team, he educates our customers, pours his heart and soul into the product, and generally makes ControlUp a better place. Trentent recently moved to be closer to family. He does not recommend moving during a pandemic.