I started migrating our physical machines to VMs using VMware a few years ago, and the first problem I ran into is still the most annoying one: the size limit for LUNs is, per VMware’s docs, (2TB – 512B). That’s 512 bytes shy of 2TB, so basically 1.99999 TB, or 2047.999 GB. So when I create a new LUN for a datastore in the SAN, the max size is 2047 GB. Now, as the VMware KB article states, this is a limitation of SCSI, not VMware per se, but that doesn’t make it any less annoying. When I first set up ESX, I created a 5 TB LUN for the datastore. It showed up in vCenter as 1 TB. After some Googling I learned of the 2 TB limit (the usable space is basically usable space = (size of LUN) % 2TB, where % is the modulo operator, which is how a 5 TB LUN ends up reported as 1 TB) and found suggestions to use extents to expand the datastore across LUNs. I did that, but I later learned there seems to be a consensus that extents should be avoided.
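Just to spell out the modulo behavior I’m describing, here’s a toy sketch with my own numbers and function name; nothing here is a real VMware API, it’s only the arithmetic:

```python
# Toy illustration of the "LUN size modulo 2TB" behavior described above.
# My own numbers and helper name; not anything VMware actually exposes.

TB = 1024 ** 4  # bytes per TB (TiB, strictly speaking)

def reported_datastore_size(lun_size_bytes):
    """Rough size ESX ends up seeing when a LUN exceeds the ~2TB SCSI limit."""
    limit = 2 * TB
    return lun_size_bytes % limit if lun_size_bytes > limit else lun_size_bytes

print(reported_datastore_size(5 * TB) / TB)    # 1.0 -> my 5 TB LUN showing up as 1 TB
print(reported_datastore_size(1.5 * TB) / TB)  # 1.5 -> under the limit, unaffected
```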
There are other things I learned along the way: that you want to limit the number of VMDKs per datastore anyway, for example, due to the risk of SCSI reservation errors and IO contention. But these are all things it feels like we shouldn’t have to worry about. I can see having separate LUNs/datastores for different logical groupings of disks, allowing you to have different snapshot schedules for each datastore, or allowing you to put an entire datastore in Tier 1 or Tier 3 (to use Compellent parlance) based on its value to you. But having to segregate stuff for technical reasons seems like a problem that should already be solved.
And maybe it is… I’ve never tried NFS datastores, but if I created an 8TB LUN, mapped it to a physical box (skirting the 2TB limit on the ESX side, since that box’s OS can handle a LUN that size), exported the volume from that host over NFS, and used that as the datastore, I guess I’d be able to do all the things I want. Hmm. I’ll have to think about that. I guess I’d still keep the ESX hosts’ local LUNs on iSCSI so they could boot from SAN, though I suppose when we move to ESXi that won’t be much of an issue anyway.
Hmm… Well, I started writing this as a rant but I think I just morphed it into a new research project.
Hi Evan, here’s some feedback.
Extents — there are two common “legitimate” use cases:
1) There’s the 2+ TB datastore case. You’ve pointed out the downsides, but there might be no choice in some situations; e.g., if you need additional storage but have an oversubscribed FC SAN fabric, it might be easier to just grow an existing LUN and add the new space as an extent than to do the zoning, masking, etc. that would be needed to present a brand new LUN.
2) Besides enabling 2+ TB datastores, another use case for extents is as a form of cheap wide striping to increase performance if your hardware isn’t natively architected that way (if you’re using Compellent arrays, you’re in good shape, as they automatically disperse blocks over multiple disks). That is, while you don’t gain any benefit in terms of LUN queueing or SCSI reservations (as you note), you do gain the benefit of more physical spindles.
NFS — it’s a great option. Keep in mind, though, that although it doesn’t suffer from the SCSI queue limitations of VMFS, it has its own limitation in that ESX can’t natively multipath to it; there’s just a single TCP connection per NFS datastore. If you’re lucky enough to be running 10 GigE this isn’t a problem, but for typical 1 GigE environments you need to take this scaling limit into account. (Some vendors offer workarounds for the single-TCP-connection issue; e.g., NetApp can be set up to export the same NFS datastore over multiple IP addresses (via virtual interfaces, or VIFs). If you’re thinking of using NFS, you should check with your array vendor whether it offers such a capability.)
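To put rough numbers on the 1 GigE point, here’s a quick back-of-the-envelope sketch; the efficiency figure is my own assumption, just for illustration:

```python
# Rough throughput ceiling for a single-TCP-connection NFS datastore.
# Assumed efficiency factor for illustration only; real numbers depend on
# protocol overhead, latency, and the array itself.

def datastore_throughput_mb_s(link_gbit, efficiency=0.9):
    """Approximate usable MB/s over one TCP connection on a link of the given speed."""
    return link_gbit * 1000 / 8 * efficiency  # Gbit/s -> MB/s, minus overhead

print(datastore_throughput_mb_s(1))   # ~112 MB/s total for the whole datastore on 1 GigE
print(datastore_throughput_mb_s(10))  # ~1125 MB/s on 10 GigE, much less of a concern
```

That ~112 MB/s is shared by every VM on the datastore, which is why the single connection matters so much more at 1 GigE.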
Thanks for the insight, Jeremy. Throughput to NFS is certainly the major downside I can see (and we are using 1gig Ethernet). I’m not sure it would be worth the effort to move from iSCSI to NFS at this point, since it would really just be for convenience and ease of management, and it’s something I only have to touch once every month or two at this point.