A couple of years ago we won a bid with a large financial services organization to help them build out a Citrix VDI environment. As part of the bid we recommended two additional products that would give this customer something they did not yet realize they needed: deep visibility into their virtual desktop environment.
While they did not see the full value at the time, the value has since become crystal clear.
(If you prefer the non-technical version of events, skip down to the “Non-Technical Version” below.)
Finding the true source of problems in VDI environments, particularly those experienced by end users, can be a challenge. Such was the case with this customer when their core application began crashing, seemingly at random. Accompanying the crashes were complaints of application slowness and general unresponsiveness from the VDI environment.
The symptoms had no geographic pattern, were not tied to any particular function, and seemed to happen at any point in the day, although they did seem to intensify after prolonged use of the VDI environment.
The environment monitoring showed Citrix ICA latency averaging under 0.01 s, with application latency hovering around 0.08 s, so there was no issue there. Internet Explorer physical and virtual memory utilization stayed flat even when compared against system uptime, which ruled out length of workstation uptime as a factor in the application's performance.
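To illustrate the uptime check described above, here is a minimal sketch of how "flat against uptime" could be tested. It is not the monitoring product's method. The sample values, the function name `slope_per_hour`, and the 0.1 MB/hour threshold are all assumptions made up for the example. It fits a least-squares line of memory use against uptime and treats a near-zero slope as flat.

```python
from typing import Sequence


def slope_per_hour(uptime_hours: Sequence[float], memory_mb: Sequence[float]) -> float:
    """Least-squares slope of memory use (MB) against uptime (hours)."""
    if len(uptime_hours) != len(memory_mb) or len(uptime_hours) < 2:
        raise ValueError("need two equal-length series with at least two samples")
    n = len(uptime_hours)
    mean_x = sum(uptime_hours) / n
    mean_y = sum(memory_mb) / n
    sxx = sum((x - mean_x) ** 2 for x in uptime_hours)
    if sxx == 0:
        raise ValueError("uptime values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(uptime_hours, memory_mb))
    return sxy / sxx


# Hypothetical samples: workstation uptime in hours and IE memory in MB.
uptime = [1, 8, 24, 72, 168]
ie_memory = [412, 409, 415, 411, 413]

growth = slope_per_hour(uptime, ie_memory)
# Assumed threshold: under 0.1 MB of growth per hour of uptime counts as flat.
is_flat = abs(growth) < 0.1
```

A slope near zero means memory use does not grow as uptime grows, which is the pattern that let us set uptime aside as a cause.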
Digging deeper into the application performance monitoring, we were able to identify a particular module in this application that was being accessed right before the application crashed. That was the first point that could be consistently linked to the crashes.
Application performance monitoring identified CPU utilization spikes that corresponded with Internet Explorer “Not Responding” events doubling over the established monitoring baseline.
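The correlation above can be sketched in a few lines. This is an illustration under stated assumptions, not the tooling we used: the `Interval` record, the `flag_intervals` helper, the sample numbers, and the 90% CPU spike threshold are all hypothetical. It flags the intervals where “Not Responding” events reach at least double the baseline while CPU utilization is also spiking.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Interval:
    label: str                  # e.g. the hour the sample covers
    cpu_percent: float          # peak CPU utilization in the interval
    not_responding_events: int  # IE "Not Responding" events in the interval


def flag_intervals(intervals: List[Interval], baseline_events: float,
                   cpu_spike_percent: float = 90.0) -> List[str]:
    """Return labels where events at least doubled over baseline AND CPU spiked."""
    flagged = []
    for item in intervals:
        doubled = item.not_responding_events >= 2 * baseline_events
        spiked = item.cpu_percent >= cpu_spike_percent
        if doubled and spiked:
            flagged.append(item.label)
    return flagged


# Hypothetical hourly samples; the baseline of 5 events is also an assumption.
samples = [
    Interval("09:00", 35.0, 4),
    Interval("10:00", 96.0, 11),
    Interval("11:00", 92.0, 5),
    Interval("12:00", 40.0, 12),
    Interval("13:00", 98.0, 10),
]

flagged = flag_intervals(samples, baseline_events=5)
```

Requiring both conditions at once filters out ordinary busy periods (high CPU, normal event counts) and isolated event bursts, leaving only the intervals worth lining up against change records.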
In our deepest dive into the application's performance, we could see a small configuration change that aligned with the Internet Explorer issues and ultimately pointed to a single ill-behaving DLL file at the source of every crash.
One DLL file in one application, damaged by a seemingly unrelated change, negatively impacted the end user experience, hurt productivity, and could have damaged the favorable reputation IT had carefully built with the deployment of the virtual desktop (VDI) environment. Fortunately, IT did not lose days of time trying to identify the problem, nor did the business lose significant productive time waiting for IT to restore proper access.
While this was not the $500,000 in productive time we were able to help a healthcare customer recapture, the productivity that was never lost was just as important, and monitoring made that level of visibility possible. The answer was found beyond infrastructure monitoring, beyond application monitoring, squarely in the space of end user experience monitoring.
Non-Technical Version
A large financial services customer's end users reported a bad experience with their key application and with their virtual desktops. The symptoms pointed in a very common direction, but deep analysis allowed us to identify a single file that had been inadvertently upgraded or modified, which caused both the Internet Explorer issues and the general desktop performance issues.
Virtual environments, including virtual desktops, often have visibility gaps into their inner workings. Those gaps can make identifying and resolving root causes difficult and time consuming for IT departments, and they frustrate those who do not understand why IT can’t just “fix it.”
With the right tools we were able to strip away the mystery and identify the problem quickly and with certainty.
While VDI can create significant support savings, visibility gaps that slow troubleshooting in virtual environments can eat into those savings, hurt the end user experience, and ultimately lower the end user adoption rate.
The key for this customer was having tools in place to let them “see” as their end users see, and to use that clarity of vision to see exactly what was at the root of their VDI problem. Instead of being completely reactive and waiting for problems to occur, the financial services organization had tools in place that let them see much deeper into their IT infrastructure and into what their end users were experiencing, which gave them the ability to correlate more data points and arrive at the ultimate solution.
If you ask management or the end users, this issue will likely not be remembered as anything special if it is remembered at all, and that is exactly the way we like it.