by Administrator
20. January 2010 15:39
We have 3 servers running Windows Small Business Server 2008 as a VM under VMware ESXi 4.0. Once every 3 or 4 weeks a server goes down. When we log on to the VMware vSphere Client and click on the Console tab, the server screen appears black. Attempting to restart the VM fails. The only way to get the server back is to hard power off the server and power it back on again.
We found that hibernation is ON by default on SBS 2008. At this point we are not sure if this is causing the issue. To turn it off, do the following:
1. Start a command prompt
2. Type: powercfg.exe /hibernate off
The hiberfil.sys will get deleted from the root of the C: drive after executing this command.
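To confirm that the file is really gone, a check along the following lines can be run on the guest. This is only a minimal sketch: it assumes Python is installed on the server and that the system drive is C:, and the function name hibernation_file_present is ours, not part of any Windows tool. It lists the directory instead of opening the file, because hiberfil.sys is a hidden system file that is held open by Windows.

```python
import os

def hibernation_file_present(root="C:\\"):
    """Return True if a file named hiberfil.sys exists directly under root.

    The comparison is case-insensitive, matching Windows file name rules.
    """
    return any(name.lower() == "hiberfil.sys" for name in os.listdir(root))
```

Calling hibernation_file_present() after running the powercfg command should return False if hibernation was turned off.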
We've also disabled the CD drive from connecting automatically. This is the setting recommended by VMware.
1. Start the vSphere Client
2. Right click on the VM and select Edit Settings
3. Click on CD/DVD and select Client Device
To increase the video memory and stop the event message "Insufficient video RAM":
1. Click on Video Card
2. Enter total Video RAM: 16
We updated ESXi 4.0 to ESXi 4.0 U1, which did not cure the problem either.
None of the above items corrected the issue.
We found that when the issue occurs, the datastore that the VM is located on is no longer accessible from the ESXi host. You can determine whether the datastore is mounted by issuing a df -h command from the ESXi console; it should list the file system that your datastore is on. These servers all have the Dell SAS 6/IR Controller. A ticket has been opened with VMware and Dell, and we are now waiting for the issue to occur again in the hope of determining the cause.
The issue did occur on one of our servers and we had VMware and Dell on the phone. They both pored through our logs, and the reason for the dismount could not be found. VMware requested that we replace the SAS 6/IR Controller. We reluctantly had it replaced, since this is occurring on all 3 servers. Below are the actual syslog entries that led to the failure:
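If you want to script the df -h check, something like the following could work. It is a rough sketch under a few assumptions: the df -h output has already been captured as text (for example over SSH), datastores show up as mount points under /vmfs/volumes/, and the datastore name "datastore1" and the sample output are made up for illustration. The exact column layout of df differs between ESXi builds, so the code only looks at how each line ends.

```python
def datastore_mounted(df_output, datastore_name):
    """Return True if any line of df output ends with the datastore's mount point."""
    target = "/vmfs/volumes/" + datastore_name
    for line in df_output.splitlines():
        if line.rstrip().endswith(target):
            return True
    return False

# Hypothetical sample, not taken from our servers.
SAMPLE = (
    "Filesystem   Size   Used Available Use% Mounted on\n"
    "vmfs3      460.0G 120.0G    340.0G  26% /vmfs/volumes/datastore1\n"
)

print(datastore_mounted(SAMPLE, "datastore1"))  # True
print(datastore_mounted(SAMPLE, "datastore2"))  # False
```

When the failure described here happens, the line for the affected datastore would be expected to be missing from the output.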
cpu0:10625)ScsiDeviceIO: 747: Command 0x2a to device "naa.600508e000000000bb70d48c4370670d" failed H:0x8 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
cpu0:10625)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x4100040b9a00) to NMP device "naa.600508e000000000bb70d48c4370670d" failed on physical path "vmhba1:C1:T0:L0" H:0x8 D:0x0 P:0x0 Possible sense data:
cpu1:4167)MPT SAS Host:3:1:0:0 :: <6> command: Write(10): 2a 00 00 00 a0 98 00 00 01 00
cpu1:4167)<6>mptscsih: ioc0: task abort: FAILED (sc=0x41000a013200)
cpu1:4167)WARNING: SCSILinuxAbortCommands Failed, Driver MPT SAS Host, for vmhba1
cpu1:4167)<6>mptscsih: ioc0: attempting task abort! (sc=0x41000a013200)
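To make these entries a little easier to read, here is a small sketch that pulls the status fields and the SCSI command out of lines like the ones above. The function names are ours. The WRITE(10) layout (4-byte block address in bytes 2 to 5, 2-byte block count in bytes 7 and 8) is the standard SCSI one. The table of host status names is the Linux SCSI DID_* list, and we are assuming, not asserting, that the H: field in the ESXi log uses the same numbering.

```python
import re

# Linux SCSI layer host status codes (DID_*). ASSUMPTION: the "H:" value
# in the ESXi log follows the same numbering.
HOST_STATUS = {
    0x0: "OK", 0x1: "NO_CONNECT", 0x2: "BUS_BUSY", 0x3: "TIME_OUT",
    0x4: "BAD_TARGET", 0x5: "ABORT", 0x6: "PARITY", 0x7: "ERROR",
    0x8: "RESET",
}

STATUS_RE = re.compile(r"H:(0x[0-9a-fA-F]+) D:(0x[0-9a-fA-F]+) P:(0x[0-9a-fA-F]+)")

def parse_status(line):
    """Return (host, device, plugin) status as integers, or None if absent."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    return tuple(int(group, 16) for group in match.groups())

def decode_write10(cdb_hex):
    """Return (logical block address, block count) from a WRITE(10) CDB."""
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a WRITE(10) command")
    lba = int.from_bytes(cdb[2:6], "big")      # bytes 2-5, big-endian
    blocks = int.from_bytes(cdb[7:9], "big")   # bytes 7-8, big-endian
    return lba, blocks

line = 'Command 0x2a to device "naa.600508e000000000bb70d48c4370670d" failed H:0x8 D:0x0 P:0x0'
host, device, plugin = parse_status(line)
print(HOST_STATUS.get(host, "unknown"), device, plugin)  # RESET 0 0
print(decode_write10("2a 00 00 00 a0 98 00 00 01 00"))   # (41112, 1)
```

Read this way, the failed command was a one-block write at block 41112, and the device status (D:) is zero, so the error was reported at the host adapter level rather than by the disk itself. Under the assumption above, H:0x8 corresponds to a reset.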
We spoke with VMware tech support on at least 3 occasions and they could never figure out what was causing this issue. The last tech that I spoke to said it was most likely being caused by the SATA drives in the server, and that his recommendation was to either use SAS drives or put the VM on an iSCSI or NFS share.
I therefore gave up on solving this issue. On one server I removed VMware and re-installed SBS 2008. On another server, since we are using a Windows 2003 Storage Server to store the data files, we moved the VM to an NFS share. You can also use the Iomega StorCenter ix2-200 as an NFS server. VMware tech support did say that they've used it and it works well.
Check out an article we wrote on how to configure NFS with ESXi:
http://www.ct-miramar.com/BlogEngine/post/Using-NFS-with-Windows-Storage-Server-2003-and-ESXI.aspx