In my previous blog post, I showed how to use ControlUp Automation to augment our troubleshooting ability by providing additional information. When it detected that the database had failed, ControlUp Automation automatically started several tasks: a CDF trace, a packet capture, and database and connectivity tests.
Once you have collected enough traces for Citrix to analyze, you are stuck waiting. While the analysis and fix are pending, we need a way to remediate these outages.
Using the Citrix Site Database going down as our primary issue, we know Citrix Virtual Apps and Desktops (née XenApp/XenDesktop) can work around it by leveraging the Local Host Cache (LHC). Citrix has a 90-second delay between detecting that the database has failed and failing over to LHC. Since the database outage can occur during peak logon times, we need to reduce the delay. In this particular case, enabling the Local Host Cache at the first sign of an outage can bring the outage down from 90 seconds to just a few seconds.
With ControlUp Automation, we can enable LHC immediately upon detection of the database-down event, and subsequently restore services once the database is detected as restored.
I’ve created a script action to enable and disable the Local Host Cache. This script accepts one parameter to force LHC into operation, and the absence of it sets the Citrix Broker to make its own determination of whether LHC should be used. For the purposes of Automation, I enable the Local Host Cache. After it’s enabled by automation, manual intervention by an admin will be required to switch back to the Citrix Broker deterministic mode.
This is to prevent flapping.
Flapping is when something in the environment keeps going up and down. If the automation also switched back to the Citrix Broker deterministic mode on its own, a flapping database would cause LHC Outage Mode to be enabled and disabled over and over. Flapping could be a far worse state than leaving the LHC enabled until the problem has passed.
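To illustrate the point, here is a minimal sketch that simulates a flapping database and counts how often LHC mode would be switched with and without automated restoration. It is not ControlUp code: the function name `Get-LhcSwitchCount`, its parameters, and the `'Down'`/`'Up'` event strings are all hypothetical and exist only for this example.

```powershell
# Hypothetical simulation, not part of the ControlUp script action.
# Counts how many times LHC Outage Mode would be toggled for a series of
# database state events.
function Get-LhcSwitchCount {
    param (
        [string[]]$DatabaseEvents,   # sequence of 'Down' / 'Up' detections
        [switch]$AutoRestore         # if set, also switch back automatically on 'Up'
    )

    $forced   = $false   # is LHC Outage Mode currently forced?
    $switches = 0        # number of mode changes made so far

    foreach ($dbEvent in $DatabaseEvents) {
        if ($dbEvent -eq 'Down' -and -not $forced) {
            # Database down: force LHC on (only if it is not already forced).
            $forced = $true
            $switches++
        }
        elseif ($dbEvent -eq 'Up' -and $forced -and $AutoRestore) {
            # Database back: only switch back when automated restoration is allowed.
            $forced = $false
            $switches++
        }
    }

    $switches
}

$flapping = 'Down', 'Up', 'Down', 'Up', 'Down', 'Up'
Get-LhcSwitchCount -DatabaseEvents $flapping -AutoRestore   # 6 mode changes
Get-LhcSwitchCount -DatabaseEvents $flapping                # 1 mode change
```

With automated restoration, every flap toggles the mode. With the "enable once, admin resets" approach, the Delivery Controller switches a single time and stays put until someone decides the problem has passed.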
The process with this automation:
The script is available here:
[cc lang="powershell"]
<#
.SYNOPSIS
    Enables the Local Host Cache on a Citrix Delivery Controller
.DESCRIPTION
    This script takes one parameter to enable the local host cache "OutageModeForced" registry value. The absence of this parameter will cause the registry value to be removed.
    Local Host Cache is force-enabled when the OutageModeForced registry value is present with a value of 0x1. When it's removed the internal logic of the broker determines the state.
    If the database is down and you remove the key, the broker will test the connectivity and if it determines that the database is down it will remain in that state. However, if it determines that the state is up, it will switch to an "Up" state.
.PARAMETER ForceLHCMode
    A switch that forces the Local Host Cache to be turned on.
.EXAMPLE
    . .\ForceLHC.ps1 -ForceLHCMode
    Forces Local Host Cache to activate.
.EXAMPLE
    . .\ForceLHC.ps1
    Removes the forceful enablement of LHC
#>
#requires -version 3
param (
    [switch]$ForceLHCMode
)

$ErrorActionPreference = 'Stop'

# Registry key that holds the OutageModeForced value
$lhcKey = 'HKLM:\SOFTWARE\Citrix\DesktopServer\LHC'

if ($ForceLHCMode) {
    # Create the key if it does not exist yet
    if (-not (Test-Path $lhcKey)) {
        New-Item -Path $lhcKey -Force | Out-Null
    }
    # OutageModeForced = 1 forces the broker into LHC outage mode
    New-ItemProperty -Path $lhcKey -Name OutageModeForced -Value 1 -PropertyType DWord -Force | Out-Null
    Write-Host "Local Host Cache Mode Forced"
} else {
    if (Test-Path $lhcKey) {
        # Only remove the value if it is present, so a second run does not throw
        if (Get-ItemProperty -Path $lhcKey -Name OutageModeForced -ErrorAction SilentlyContinue) {
            Remove-ItemProperty -Path $lhcKey -Name OutageModeForced -Force
        }
        Write-Host "Deterministic Local Host Cache Mode Configured"
    }
}
Or, the complete ControlUp Script Action is here. Simply save it as an XML file and import it into the Script Management window in ControlUp.
<?xml version="1.0" encoding="utf-8"?> <ArrayOfSBADescriptor xmlns_xsd="http://www.w3.org/2001/XMLSchema" xmlns_xsi="http://www.w3.org/2001/XMLSchema-instance"> <SBADescriptor> <SBAId>a5921eac-09c5-40fe-8cbd-a8e4649969e6</SBAId> <RootSBAId>866e7996-1ccb-48d1-bb11-e56b5b670801</RootSBAId> <SQN>0</SQN> <Name>CA - Configure LHC Mode</Name> <Description>Configures the Local Host Cache to be forcibly enabled or not.</Description> <Version>0.1.3</Version> <DateCreated>2019-07-10T15:54:24.9444858</DateCreated> <DateModified>2019-07-10T16:08:54.0987266</DateModified> <LastEditorName>trententtye00</LastEditorName> <CreatorName>trententtye00</CreatorName> <Status>2</Status> <CheckSum>gOEhy8IhbooSHkdcdBvOD+EJvDY=</CheckSum> <DenyCommunityShare>false</DenyCommunityShare> <Icon> <IconData>iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAYAAAAf8/9hAAAABGdBTUEAALGPC/xhBQAAAAlwSFlzAAAOwQAADsEBuJFr7QAAABh0RVh0U29mdHdhcmUAcGFpbnQubmV0IDQuMS42/U4J6AAAAOZJREFUOE/dkUsOgjAURQsT0EpU/FLYhThAxvy6KBPXYoxxJzh1ASbO3IBTnq8NFDUa/My8zUn7Xm9v2pQ8yEYWbyB8SjbtDjfWwNv3R+4Za2jCGrBDuzta49omhkkv1UYQBJAkSUkKWVojeo7jSJ9pmnI2WvJsnSwMnPOXhGEIlFLlL7krvoGAyxik5TVvedar8Dy3DuA8g09UFIV80vcBOH4KEPq3AOYyyPO83GqW8DL8ehUgmPk+xHH8kiiKJGLtz+fVYSCapq9ob7zr2NOjajaA3lO7N9lqur7EWmmMZG8ivChCropnCGRnTkTxAAAAAElFTkSuQmCC</IconData> <IconType>2</IconType> </Icon> <ExecutionDescriptor> <AllowedEntitiesEx>256</AllowedEntitiesEx> <AtomicEntityEx>256</AtomicEntityEx> <AllowedEntities>0</AllowedEntities> <AtomicEntity>0</AtomicEntity> 
<ScriptData>H4sIAAAAAAAEANVW207bQBCd50r9BytFAiSCivpWtZVQEwQqlyhJS6vSBxM7xMJJaGwHopZ/75mZtXdtnJAiVKmyNvF6Z2fOmduuRzfk0wxjTB5t0Ut6gX99vlNCtxRRSgMa0Q/aoAOaQnZAIR3TIX2kE8wDzHTXtvzzeAWpkH5Sht38lmC1SXO8zfAeYdcEX96I7Aa18XUmmvehOzXrHdk5lD0h5mzVo/cYm9SD1BTIN0XDO9hzce9i/Rud0hl09OgIw11tQ5dPlxQbXCm4seZjaBxgJcb7Id4TrHjg6At7llDUvnxlr8zwe4dZC6sx3pXfQtZZNhVWsViaFb6xKFvA0oNsFxg71MfvGVC7Mn1YjgRlAhRs70ZQpUBxbfCzJUZn4xhinS164iVPvJcztnxjh++o4Dtw+DaAJxNbV5jnsbY5EEDCk+hcCcrUsJ+L1gzfdwsWoXjuElI2llNEV/EoxzoGnH+xYGRkmey3HFbZdv2ofrgs9owxnwsDRfh47BXhsODeLHk1EKSjErbHfLcavVf4ROsnlHzyTD2OTB668urN18jIPcPqXJBMRFeKWkkesLdoI9HPPp8YP3B+MLpByZM2ZqF4lDP82sQqKOI2Fn3l+krEG2mRFUclTYGssT99E2PlH0D/ram7iSBe4EtWYWL1MJYF7SzFaPMpFXypIzmQWmLc2oXm4reFYzsSzJHsWs7WL2l1vfc4y6iIsqJUlr5Y8MyvtVH16aHoCU0n2nkiYqs1x5chD3dq0LnnQ15lvmBs0GfsaVQwPuyBHfT8LsYJemEfo4uvq08ZffZrrOc8bJX+XX93u0QKzjPxUUDaY+vQt+mrYO9Ae7uyxuPiAZddeCVBhfJ5uA7PgxKXdTn4RQYv9/zTsbuyXacKXW/bGAzhScbr9stx0c+0n6h+vTl8KLDy0PzdWnH32Mb6rxKmfE8TNqawsoXM0lpvgq1v+id78BN0ndBbcO2hXx9A7lzysY0v7hl/IWd8gi6i946eVJFW2YXBs12DJH/G2BnIXei5rbrW7kuzU8jdgvGRZMBY7lOshXdrV3O98byoWPepOcnXOQmb9MU5yfYMtjLePsaNrLeAjHcHlSry6Hdhje1n0qVcn5wLh1TOb1s/jTXqKsftVe4/rP1esjs2Hb0+F/9NBi7LP7dK/5eMsPM6Rssi2aqccXq74lvMOt3TRllv8EPsvJKTwEa7XG33Raf6A8oBur5KDQAA</ScriptData> <IsCompressed>true</IsCompressed> <Interactive>false</Interactive> <AltCredentialsType>0</AltCredentialsType> <Timeout>60</Timeout> <ScriptType>3</ScriptType> <TimeoutPolicy>1</TimeoutPolicy> <ConcurrencyMode>2</ConcurrencyMode> <ExecutionContext>0</ExecutionContext> <Arguments> <inner> <sxSerializedDictionaryOfStringSBAArgumentDescriptor> <item> <key> <string>d7a6df35-8c32-44c8-811e-994fd056ef00</string> </key> <value> <SBAArgumentDescriptor> <ArgumentId>d7a6df35-8c32-44c8-811e-994fd056ef00</ArgumentId> <ArgumentType>1</ArgumentType> <Name>Force LHC Mode?</Name> <DefaultValue>-ForceLHCMode</DefaultValue> <ArgumentNumber>0</ArgumentNumber> <ValidationString /> <ValidationError /> <MaskTypedCharacters>false</MaskTypedCharacters> 
<EntityType>0</EntityType> </SBAArgumentDescriptor> </value> </item> </sxSerializedDictionaryOfStringSBAArgumentDescriptor> </inner> </Arguments> <InstructionsExecutionOnComputer /> <SupressOutputDialog>false</SupressOutputDialog> <DefaultSharedCredGuid /> </ExecutionDescriptor> <ImportValidationString>jNIYWaS48QYo+AwVd6fQaiCcdpE=</ImportValidationString> </SBADescriptor> </ArrayOfSBADescriptor>
Due to the criticality of the database outage and the required intervention by an Administrator, the trigger is going to have another action assigned. I will look at the new ControlUp Email Templates and configure one for this Script Result. The ControlUp Email template allows us to customize a tailored alert. Since this issue could cause a Major Incident, this email alert will contain emphasis and color coding to ensure the message is received. More information on email templates can be found here.
I’ll configure the template like so:
The text for the email template:
<h1><span style="color: #ff0000;"><strong>WARNING!</strong> </span></h1>
<p>Citrix Database was detected as <span style="text-decoration: underline;"><strong>DOWN!</strong></span><br /></p>
Local Host Cache has been <span style="text-decoration: underline;"><strong>ENABLED</strong></span> and will remain enabled until you run <b>"$(ScriptName)"</b> against <b>"$(CompName)"</b>, disabling Force Outage Mode.
Fault detected at: ($(Timezone)): $(ScriptStartTimestamp)
[/cc]
The result of the email template:
I set up the trigger as follows:
Now let’s watch automatic remediation in action.
Video of AA in action