Wednesday 31 March 2010

Always check the backup....

So, last Friday one of the learning managers asked if there was any way we could set up a calendar that all staff could view... After a few head-scratching minutes, various options were considered. I then recalled that GroupWise has calendar publishing. GR8!

By the end of the day I had it working, but that was interrupted by general performance issues around the network, mainly with the GroupWise server. (Could it be calendar publishing??) Anyway, by the end of the day everything seemed OKish.

Saturday I got a text to say webmail was not working, and after remoting in to the VMware ESX server I could see that all was not well with the GroupWise box. This turned out to be the SAN volume being out of disk space. Once that was resolved, things got back to normal.
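A volume quietly filling up is the sort of thing a small scheduled check can catch before webmail falls over. Below is a minimal sketch, assuming the check runs on a machine that has the volume mounted and has Python available. The function name `is_low_on_space` and the 10% threshold are made up for illustration and are not what was running here.

```python
import shutil


def is_low_on_space(path, min_free_fraction=0.10):
    """Return True if the volume holding `path` has less free space
    than `min_free_fraction` of its total size.

    The 10% default is an arbitrary, illustrative threshold.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free < usage.total * min_free_fraction


if __name__ == "__main__":
    # "." is a placeholder; point this at the mount you care about.
    if is_low_on_space("."):
        print("WARNING: volume is running low on disk space")
    else:
        print("Disk space looks OK")
```

Run from a scheduler, the warning branch could send an email or a text instead of printing.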

Monday started off OK, until 10ish. Then GroupWise started locking up. Eventually it got to the stage that VMware could not even power off the machine!!! The ESX host had to be restarted. I hoped this was the last of it.

But on Tuesday the same again: GroupWise showing 100% utilisation on the console.

I did a little more digging, shutting down all of the mail services. Still 100%.

Further digging and looking at the Busiest Threads showed a number of TSA threads running.

Hmm... TSA usually relates to backups. I unloaded the Backup Exec agent with no apparent difference.

TSA threads were still apparent.

I unloaded TSAFS, and almost straight away performance was restored. GroupWise reloaded and the server was happy again.



Checking the backup server then revealed the root cause!! A stuck backup job, which was stuck because the tape library had fallen over.


So... If you have performance issues, check your backups ain't still trying to run.
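For what it's worth, that check is easy to automate once you can get a list of jobs out of your backup software. Here is a rough sketch, assuming you have already exported the jobs as (name, start time, finished) tuples by whatever means your product offers. That tuple format, the function name `find_overrunning_jobs`, the 8 hour limit and the sample jobs are all hypothetical, and nothing here talks to Backup Exec itself.

```python
from datetime import datetime, timedelta


def find_overrunning_jobs(jobs, now, max_runtime=timedelta(hours=8)):
    """Return the names of jobs that are still running after max_runtime.

    `jobs` is an iterable of (name, start_time, finished) tuples.
    This format is an assumption for illustration, not a real export.
    """
    return [
        name
        for name, started, finished in jobs
        if not finished and now - started > max_runtime
    ]


if __name__ == "__main__":
    # Made-up sample data: one job started days ago and never finished.
    now = datetime(2010, 3, 30, 10, 0)
    jobs = [
        ("groupwise-nightly", datetime(2010, 3, 26, 22, 0), False),
        ("files-nightly", datetime(2010, 3, 29, 22, 0), True),
        ("web-nightly", datetime(2010, 3, 30, 6, 0), False),
    ]
    for name in find_overrunning_jobs(jobs, now):
        print("Possible stuck backup job:", name)
```

Anything it flags is worth a look on the backup server (and the tape library) before you start blaming the mail system.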

Rob