We are using AppInsight SQL to monitor our production database servers, as I am sure others are as well. On 3 separate occasions now, we have seen the query coming from Orion hang on the server for days on end, ultimately maxing out the CPU and requiring the DB server to be restarted.
As can be seen from the image below, there are two session IDs running on the monitored database server:
My DBA group tried countless times to kill these sessions but was unable to do so. Ultimately, to clear these queries and reclaim the CPU, the database server itself had to be restarted. Obviously this poses a HUGE problem, as restarting production machines in our environment (and yours too, I'm sure) is easier said than done. Un-managing the server within Orion does not stop the query either.
Every time we have seen this occur, it has been the same query. When run manually, this query always returns extremely quickly, and the session is closed out once it has run. Typically it behaves the same way when run from the Orion monitoring platform, but occasionally, and unpredictably, it does not.
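For anyone who wants to check their own servers for the same symptom, here is a minimal sketch of how long-running requests could be flagged. It is my own illustration, not something support provided. The query uses only the standard `sys.dm_exec_requests` and `sys.dm_exec_sessions` views; the function name `find_stuck_requests`, the one-hour threshold, and the sample rows are all hypothetical, and you would run the query with whatever SQL client you normally use.

```python
from datetime import datetime, timedelta

# T-SQL listing active user requests together with the program that owns them.
# Run it with any SQL client; it is kept here as a string for reference only.
ACTIVE_REQUESTS_SQL = """
SELECT r.session_id, r.status, r.command, r.start_time, r.cpu_time,
       s.program_name, s.host_name
FROM sys.dm_exec_requests AS r
JOIN sys.dm_exec_sessions AS s ON s.session_id = r.session_id
WHERE s.is_user_process = 1;
"""


def find_stuck_requests(rows, now, max_age=timedelta(hours=1)):
    """Return the session_ids whose request has been running longer than max_age.

    rows is a list of dicts with at least 'session_id' and 'start_time' keys,
    for example the result set of ACTIVE_REQUESTS_SQL converted to dicts.
    """
    return [row["session_id"] for row in rows if now - row["start_time"] > max_age]


# Hypothetical sample data: one request running for days, one just started.
sample_rows = [
    {"session_id": 71, "start_time": datetime(2014, 5, 1, 8, 0)},
    {"session_id": 88, "start_time": datetime(2014, 5, 3, 11, 59)},
]
now = datetime(2014, 5, 3, 12, 0)

stuck = find_stuck_requests(sample_rows, now)
print(stuck)  # [71]
```

If a session has already been killed and still shows up, `KILL <session_id> WITH STATUSONLY` reports the rollback progress for it, which at least tells you whether it is moving at all.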
So far I have opened up two cases on this:
Case# 619436 - Unable to come to a resolution. Support requested that we enable debug logging to gather more data. The case was ultimately closed because the issue did not recur soon enough to proceed.
Case# 634478 - Problem occurred again on a different server with the same specs (Windows 2008, SQL 10.0.2573.0 SP1). Support is still unable to come to a resolution. Apparently the debug logging does not allow them to go back far enough to see when either of these hung queries was started.
So, at this point I have my management losing all faith in this product that was supposed to be a game changer in the way we monitor our environment. I have my DBA group wanting all SQL monitoring stopped so that we do not bring the server to a crawl. Application groups are also starting to question whether they want us to monitor their software now based on the effects AppInsight SQL has had on our DB servers. All the while, support cannot give me any answers. Awesome.
I caution everyone against using AppInsight SQL monitoring, unless you do not care about adversely affecting your production environment.
In case anyone asks, our SolarWinds environment is as follows:
Primary Server: Windows 2012 R2
Additional Poller: Windows 2012 R2
Additional Web Server: Windows 2012 R2
Database Server: Windows 2008 - SQL 2012