We recently deployed XenDesktop 5.6 VDI on vSphere 5.0. It has mostly worked well so far, except that a couple of times some VDI virtual machines failed to power on. The status shown in the vSphere Client task pane was "Reason: Too many users". The workaround is to migrate the affected VM to another host.



I caught the following entries in the vmkernel.log file. They show that a lock on datastore DS-34 (where the VDI resides) is held in mode 2, which is a read-only lock (e.g. on the ..-flat.vmdk of a running VM with snapshots), and that the power-on request found the lock "Not free".

2013-11-14T03:22:25.475Z cpu75:11252180)Config: 346: "SIOControlFlag2" = 1, Old Value: 0, (Status: 0x0)
2013-11-14T03:22:25.607Z cpu23:14359890)MemSched: vm 14359890: 7453: extended swap to 8192 pgs
2013-11-14T03:22:25.892Z cpu26:14359890)World: vm 14359891: 1267: Starting world vmm0:XXX-XXX-VDI-78 with flags 4008
2013-11-14T03:22:25.893Z cpu26:14359890)Sched: vm 14359891: 6217: Adding world 'vmm0:XXX-XXX-VDI-78', group 'host/user', cp                                                                          max=-1
2013-11-14T03:22:25.893Z cpu26:14359890)Sched: vm 14359891: 6232: renamed group 12460920 to vm.14359890
2013-11-14T03:22:25.893Z cpu26:14359890)Sched: vm 14359891: 6249: group 12460920 is located under group 4
2013-11-14T03:22:25.894Z cpu26:14359890)MemSched: vm 14359890: 7453: extended swap to 12550 pgs
2013-11-14T03:22:25.960Z cpu26:14359890)World: vm 14359893: 1267: Starting world vmm1:SYD-PRO-VDI-85 with flags 4008
2013-11-14T03:22:26.106Z cpu26:14359890)DLX: 3394: vol 'DS-34': [Req mode 2] Checking liveness of [type 10c00001
gen 27, mode 2, owner 00000000-00000000-0000-000000000000 mtime 2969702 nHld 8 nOvf 0]
2013-11-14T03:22:30.108Z cpu77:14359890)DLX: 3901: vol 'DS-34': [Req mode: 2] Not free; Lock [type 10c00001 offs
gen 27, mode 2, owner 00000000-00000000-0000-000000000000 mtime 2969702 nHld 8 nOvf 0]
2013-11-14T03:22:30.169Z cpu31:11252179)Config: 346: "SIOControlFlag2" = 0, Old Value: 1, (Status: 0x0)
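
To pick these DLX entries out of a long vmkernel.log without reading it by hand, something like the following Python sketch may help. It is only an illustration: the regular expressions are inferred from the excerpt above rather than from any documented log format, the names `find_lock_events` and `LockEvent` are made up for this example, and treating `nHld` as a holder count is my assumption.

```python
import re
from typing import Iterable, List, NamedTuple, Optional

# Patterns inferred from the log excerpt in this post (an assumption, not a
# documented format). Both straight and typographic quotes are accepted
# around the volume name, because pasted logs often contain the latter.
TIMESTAMP_RE = re.compile(r"^\d{4}-\d{2}-\d{2}T")
HEAD_RE = re.compile(
    r"(?P<ts>\S+) cpu\d+:\d+\)DLX: \d+: vol ['‘](?P<vol>[^'’]+)['’]: "
    r"\[Req mode:? (?P<req>\d+)\] (?P<msg>Checking liveness|Not free)"
)
DETAIL_RE = re.compile(r"\bmode (?P<mode>\d+), owner \S+ .*?nHld (?P<nhld>\d+)")


class LockEvent(NamedTuple):
    timestamp: str
    volume: str
    requested_mode: int
    message: str
    held_mode: Optional[int]  # mode the lock is currently held in
    nhld: Optional[int]       # raw nHld value (presumably the holder count)


def find_lock_events(lines: Iterable[str]) -> List[LockEvent]:
    """Collect DLX lock entries, including details wrapped onto the next line."""
    rows = [line.rstrip("\n") for line in lines]
    events = []
    for i, row in enumerate(rows):
        head = HEAD_RE.search(row)
        if head is None:
            continue
        # The lock details may continue on a following line that has no
        # timestamp of its own, as in the excerpt above.
        tail = row[head.end():]
        if i + 1 < len(rows) and not TIMESTAMP_RE.match(rows[i + 1]):
            tail += " " + rows[i + 1]
        detail = DETAIL_RE.search(tail)
        events.append(LockEvent(
            timestamp=head.group("ts"),
            volume=head.group("vol"),
            requested_mode=int(head.group("req")),
            message=head.group("msg"),
            held_mode=int(detail.group("mode")) if detail else None,
            nhld=int(detail.group("nhld")) if detail else None,
        ))
    return events
```

Passing it an open file handle for a copy of the host's vmkernel.log and keeping only the events whose `message` is "Not free" should list the volumes and timestamps where a power-on was refused the lock.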

Because XenDesktop uses linked clones, the base vmdk file of the VDI image is accessed by multiple virtual machines, which makes locking issues possible. With this in mind, I found VMware KB2052862, which says that high activity on disk trees containing a lot of linked clones may lead to this problem. This is actually a bug in ESXi, which has been fixed in 5.5. For 5.0 and 5.1, patches are available. (ESXi 5.0 patch, ESXi 5.1 patch)