Frankenstein – eat your heart out….

OK – that heading sounds completely horrible next to “Frankenstein”. I don’t mean it literally! “Eat your heart out” is an English idiom, usually spoken as a way of asserting some sort of superiority over the person being addressed.

Now that we’ve got that out of the way: if you haven’t been following my blog (thanks to my one long-time blog follower 😉), you may have missed the ongoing saga of my testing environment and its “Storage Spaces Direct” two-node cluster. Let’s be frank, this thing has been through the wringer and is still running. I’m happy to send you the link to the previous blog post so you can appreciate everything Storage Spaces Direct has been put through – and yet it persists!

To continue this epic saga of almost Lucas-like proportions, I want to share the latest chapter in how Microsoft Storage Spaces Direct continues to impress, and why you shouldn’t discount this technology – particularly in its latest iteration in Windows Server 2019.

Which brings us to the start of this latest installment. First, the briefest of histories. I have a two-node cluster running on some IBM (yes, IBM, before the Lenovo branding) x3650 M2 hardware in a test environment. The nodes have different memory configurations but similar CPUs and disks (although I still haven’t found a replacement for the OS mirror on one of the nodes), running as a Storage Spaces Direct cluster originally built on preview 2016 code and then upgraded to the 2016 release code. The disks that form part of the Windows Storage Spaces Direct cluster (let’s call it WSSD from now on) are set up as individual single-disk RAID 0 volumes, which strictly speaking isn’t supported (S2D expects disks presented directly through a pass-through HBA) but works on this controller. I have a couple of VMs running on the cluster over ReFS Cluster Shared Volumes. If you’re looking for more info on understanding or setting up WSSD, have a look here: https://docs.microsoft.com/en-us/windows-server/storage/storage-spaces/storage-spaces-direct-overview
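For anyone who’d rather see it than read about it, here’s a minimal sketch of how a two-node WSSD cluster like mine gets stood up. The node, cluster and volume names are placeholders, not my actual ones:

```powershell
# Validate the hardware first – S2D is (rightly) fussy about what it runs on.
Test-Cluster -Node "Node1","Node2" -Include "Storage Spaces Direct","Inventory","Network","System Configuration"

# Create the cluster with no shared storage – S2D pools the local disks instead.
New-Cluster -Name "S2DCluster" -Node "Node1","Node2" -NoStorage

# Enable Storage Spaces Direct; it claims the eligible local disks into a pool.
Enable-ClusterStorageSpacesDirect -CimSession "S2DCluster"

# Carve out a ReFS Cluster Shared Volume for the VMs to live on.
New-Volume -CimSession "S2DCluster" -StoragePoolFriendlyName "S2D*" `
    -FriendlyName "VMVolume1" -FileSystem CSVFS_ReFS -Size 500GB
```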

So, deciding it was time to upgrade the cluster to the release code of Server 2019, I went ahead and did an in-place upgrade of each node in turn. I didn’t expect it to retain my configuration, but it did…. almost. After upgrading the second node I started getting some cluster-related errors and couldn’t bring the cluster resource online due to a permissions error. I thought this was the end and I’d have to rummage around for some newer hardware to continue my WSSD test environment, or even move it to Azure!
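For the record, once every node is running the new OS there are a couple of housekeeping steps to bring the cluster and the storage pool up to the new version. A sketch, assuming you run it from one of the upgraded nodes:

```powershell
# With all nodes on Server 2019, raise the cluster functional level...
Update-ClusterFunctionalLevel

# ...then upgrade the storage pool metadata to match.
# (Skip the primordial pool – only the S2D pool needs updating.)
Get-StoragePool | Where-Object IsPrimordial -eq $false | Update-StoragePool
```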

After a bit of deeper digging, it turned out the Domain Controllers in my test environment had decided to go out to lunch with some nasty RPC errors (not DNS…. this time). After resolving that, and a bit of cluster cajoling, my 2019 WSSD cluster came back online. Data and configuration were still there. Virtual machines came back online. Everything kept on working.
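If you ever find yourself doing similar cajoling, these are the sorts of standard health checks I leaned on – nothing exotic, and the log destination path is just an example:

```powershell
# Quick health pass over the cluster once the DCs were behaving again.
Get-ClusterNode                                         # are both nodes Up?
Get-ClusterResource | Where-Object State -ne 'Online'   # anything still failed?

# Verify each node's secure channel to the domain – my "permissions" error
# was really the broken DCs, not the cluster itself.
Test-ComputerSecureChannel -Verbose

# Confirm the virtual disks are healthy and any repair jobs have finished.
Get-VirtualDisk
Get-StorageJob

# If the cluster still won't come online, pull the last hour of cluster logs.
Get-ClusterLog -TimeSpan 60 -Destination C:\Temp
```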

I remain impressed with the integrity of the data, and with the skill in coding something that can withstand the treatment it has received in my test environment and continue to run. Not only that, I went ahead and enabled data deduplication on one of the WSSD ReFS CSV volumes for a demo (dedup on ReFS is new in Server 2019), and it reported 51% space savings on the VMs running on the cluster, while continuing to operate as well as it can on the hardware it has.
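Enabling it is almost a one-liner once the feature is installed on each node. A sketch, with the volume path matching the placeholder CSV name from earlier:

```powershell
# Deduplication is a feature – install it on every node in the cluster.
Install-WindowsFeature -Name FS-Data-Deduplication

# Enable dedup on the ReFS CSV, tuned for running Hyper-V workloads.
Enable-DedupVolume -Volume "C:\ClusterStorage\VMVolume1" -UsageType HyperV

# Check the savings once the optimisation job has had a chance to run.
Get-DedupVolume | Select-Object Volume, SavingsRate, SavedSpace
```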

If you’re still in the business of running some on-premises equipment in this modern ‘hybrid infrastructure’ world, I’d suggest considering WSSD on some real, vendor-supported hardware. Along with the management available through the Windows Admin Center (yes, ‘Center’ is the US spelling – it’s OK for those of us here in Aus), the experience keeps getting better.

Just like a good Saturday morning breakfast!