Troubleshoot Hyper-V Cluster Node Blue Screen Issue


We had an incident last Friday: a couple of Hyper-V cluster nodes hit a blue screen and rebooted themselves. With the Windows debugging tool and some knowledge of clustering, I think I have figured out what happened.

1) Run the Windows debugging tool (WinDbg) and set the symbol path: SRV*C:\Symbols*http://msdl.microsoft.com/download/symbols

image

2) Copy the dump file to your local machine; it is located at \\Server\c$\windows\Minidump\. Then open it in the debugger and run the following commands to find the crashed process:

!analyze -v

image

image

What is netft?

lmvm netft

image

According to the following output, the crashed process is rhs.exe (the Resource Hosting Subsystem in Cluster).

!process fffffa80291d5b30

image

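It is expected that the Cluster service will bug-check and reboot Windows when a critical process such as rhs.exe crashes or hangs. You can confirm this by running the following command on your cluster node and checking the value of HangRecoveryAction. As far as I know, a value of 3 means the node is stopped with a blue screen.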

cluster /cluster:<cluster-name> /prop

image

3) Now we know the issue is related to the cluster. Generate the cluster log by running the following command, then copy the log file to your local machine. The file is located at: \\Server\c$\Windows\Cluster\Reports\Cluster.log

Cluster log /g

4) Let’s see what happened at that time (I use trace32 to open the log file). The ISO-Images disk was deadlocked for some reason. (I confirmed with the network admin that an abrupt network outage happened that day around that time.) Why did this happen only to the ISO-Images disk (it is in the Available Storage group) while all the CSV disks were fine? I think the only shared disks in a Hyper-V cluster should be CSVs, so we decided to remove the ISO-Images disk to prevent the issue from happening again.
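If you prefer to narrow the log down before opening it in a viewer, here is a minimal Python sketch that keeps only the warning and error entries within a few minutes of the crash. It assumes each log entry starts with "<process id>.<thread id>::<timestamp> <level>" and that the levels of interest are called ERR and WARN; the function name entries_near is my own, and you should check the layout against your own Cluster.log.

```python
import re
from datetime import datetime, timedelta

# Assumed entry prefix: "<hex pid>.<hex tid>::YYYY/MM/DD-HH:MM:SS.mmm LEVEL ..."
LINE_RE = re.compile(
    r"^[0-9a-fA-F]+\.[0-9a-fA-F]+::"
    r"(\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+(\w+)"
)
TS_FORMAT = "%Y/%m/%d-%H:%M:%S.%f"


def entries_near(lines, crash_time_gmt, minutes=5, levels=("ERR", "WARN")):
    """Return log entries of the given levels within +/- minutes of the crash.

    crash_time_gmt is a naive datetime expressed in GMT, because the
    timestamps in the cluster log are in GMT as well.
    """
    window = timedelta(minutes=minutes)
    hits = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that do not start with the assumed prefix
        stamp = datetime.strptime(match.group(1), TS_FORMAT)
        if match.group(2) in levels and abs(stamp - crash_time_gmt) <= window:
            hits.append(line.rstrip("\n"))
    return hits
```

To use it, open your copy of Cluster.log, pass the file object as lines, and pass the crash time (converted to GMT) as crash_time_gmt.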

Don’t forget that the cluster log timestamps are in GMT; you need to convert them to your local time before comparing them with the crash time.
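The conversion is easy to get wrong by hand, so here is a small Python sketch of it. It assumes the timestamps look like 2012/05/18-14:23:45.123 (adjust the format string if your log differs); the function name cluster_ts_to_local is only for illustration.

```python
from datetime import datetime, timezone, timedelta

# Assumed timestamp layout in the cluster log, e.g. "2012/05/18-14:23:45.123"
CLUSTER_TS_FORMAT = "%Y/%m/%d-%H:%M:%S.%f"


def cluster_ts_to_local(ts, local_tz=None):
    """Interpret a cluster log timestamp as GMT and convert it.

    With local_tz=None the result uses the time zone of the machine
    running the script; pass a tzinfo to convert to another zone.
    """
    gmt_time = datetime.strptime(ts, CLUSTER_TS_FORMAT).replace(tzinfo=timezone.utc)
    return gmt_time.astimezone(local_tz)


# Example: convert to a fixed GMT-5 offset (the offset is just an illustration)
example = cluster_ts_to_local("2012/05/18-14:23:45.123", timezone(timedelta(hours=-5)))
print(example.isoformat())
```

Called with only the timestamp string, the function converts to whatever time zone the local machine is set to, which is usually what you want when matching the log against the time the node rebooted.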

image
