Storage Spaces Striping and Mirroring

Whenever I’m involved in a discussion about Storage Spaces, three major issues always come up:

  • Performance related to disk layout
  • Storage Pool Expansion, especially of a tiered pool
  • Parallel rebuild not working

The technology itself has reached maturity, but you have to take care of a few things when you deploy it. In this post I’ll try to explain what Storage Spaces striping and mirroring are and how they affect the issues mentioned above. This is not an article that covers everything about Storage Spaces. If you have never worked with the technology, I suggest you first visit the Storage Spaces Frequently Asked Questions (FAQ).

Storage Spaces offer two types of resiliency: Mirror and Parity. There is also a third so-called resiliency type, Simple, but it is not resilient to disk failures at all. Here I will discuss only Mirror resiliency, since it is the one you would use for production workloads like Hyper-V or SQL. There are two types of mirror: two-way and three-way. A two-way mirror keeps one additional copy of your data, while a three-way mirror keeps two additional copies. With a three-way mirror you can use only 1/3 of the available disk space, and I have never used it in production.

Striping and Mirroring

When you create a Storage Space, it will by default try to use a layout that achieves maximum performance. It stripes the data across multiple disks so that it can read from and write to them simultaneously. Striping is defined by NumberOfColumns and Interleave, where:

  • NumberOfColumns represents the number of disks across which a stripe is written.
  • Interleave represents the amount of data written to a single column (default 256 KB).

Stripe size, or stripe width, is the amount of data written to a Storage Space in one pass, and it equals NumberOfColumns * Interleave. The more columns (disks) you have, the better the performance. If you use the GUI, the maximum NumberOfColumns is 4 and Interleave is always 256 KB. When you create a Storage Space from PowerShell, you can control both values with the NumberOfColumns and Interleave parameters of the New-VirtualDisk cmdlet.
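To make the stripe width formula concrete, here is a minimal Python sketch. It only does the arithmetic described above; the function name `stripe_width_bytes` is made up for illustration and is not part of any Storage Spaces API.

```python
# Illustration only: models "stripe width = NumberOfColumns * Interleave".
# stripe_width_bytes is a hypothetical helper, not a Storage Spaces API.
KB = 1024

def stripe_width_bytes(number_of_columns: int, interleave_bytes: int = 256 * KB) -> int:
    """Return the amount of data written in one pass across all columns."""
    if number_of_columns < 1 or interleave_bytes < 1:
        raise ValueError("columns and interleave must be positive")
    return number_of_columns * interleave_bytes

# Compare stripe widths for 1 to 4 columns with the default 256 KB interleave.
for columns in (1, 2, 3, 4):
    width_kb = stripe_width_bytes(columns) // KB
    print(f"{columns} column(s): {width_kb} KB per stripe")
```

With the default interleave, each extra column adds another 256 KB to the data written in one pass, which is why the column count matters so much for throughput.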

Since we definitely need resiliency for production workloads, we combine striping with mirroring. Mirroring writes another copy of the data so we can survive a disk failure. To create a two-column two-way mirror you need a minimum of 4 physical disks. Storage Spaces will stripe the data across two physical disks and then write another copy of the data to the other two disks. This is very similar to RAID 0+1. Storage Spaces still uses 4 disks, but, depending on the number of disks in the pool and the number of spaces created, not in the way RAID controllers usually do. That’s why RAID terminology is not used when talking about Storage Spaces. This type of resiliency can sustain the failure of a single disk.

Examples:

Let’s sum up all of the above in a few examples to make it clearer. For a two-way mirror, the minimum number of disks you need is 2 * NumberOfColumns, because after striping we need a second disk set of the same size to hold the copy of the data.
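The disk count rule can be sketched in a few lines of Python. This is only the arithmetic from the paragraph above for a two-way mirror; the helper names are hypothetical and the sketch does not query a real storage pool.

```python
# Illustration only: two-way mirror disk arithmetic from the text.
# Both helper names are hypothetical, not Storage Spaces cmdlets.
TWO_WAY_MIRROR_COPIES = 2  # the original data plus one copy

def min_disks_two_way_mirror(number_of_columns: int) -> int:
    """Minimum physical disks needed for the given column count."""
    return TWO_WAY_MIRROR_COPIES * number_of_columns

def max_columns_two_way_mirror(disk_count: int) -> int:
    """Largest column count that the given number of disks can support."""
    return disk_count // TWO_WAY_MIRROR_COPIES

for columns in (1, 2, 3):
    print(f"{columns} column(s) need at least {min_disks_two_way_mirror(columns)} disks")

print("6 disks support up to", max_columns_two_way_mirror(6), "columns")
```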

6 disks, 3-column 2-way mirror
With 6 physical disks, we use 3 columns in a two-way mirror to achieve maximum performance. The data is first striped across three disks and then copied to another three disks.

Figure 1 – 6 disks, 3-column 2-way mirror Storage Spaces layout

6 disks, 2-column 2-way mirror
In this example we are not getting the maximum performance, since we are striping data across “only” two disks simultaneously, but with this layout you need fewer disks to expand the storage pool.

Figure 2 – 6 disks, 2-column 2-way mirror Storage Spaces layout

Storage Tiers

Storage tiers allow us to create Storage Spaces on top of a pool that combines SSDs and HDDs. Data is moved between the two tiers at the sub-file level, based on how frequently it is accessed. The most used, so-called “hot” data is moved to the SSD tier, while the bulk of the data, which is accessed infrequently, is kept on the HDD tier. This way we get the performance of SSDs and the capacity of inexpensive HDDs. When creating a Storage Space with storage tiers, the storage pool must have a sufficient number of SSDs and HDDs to support the selected storage layout. The striping model (number of columns) has to be the same in both tiers. Besides keeping the “hot” data on SSDs, storage tiering also creates a Write Back Cache on part of the SSD tier (default 1 GB), which additionally improves overall performance.

Examples:

4 SSD, 6 HDD, 2-column 2-way mirror
The maximum number of columns we can have is determined by the tier with fewer disks. In this example the SSD tier has 4 disks, so the maximum number of columns for a two-way mirror is 2. The data will be striped across two disks and then copied to another two disks. The HDD tier has 6 HDDs, but its data will also be striped across two disks and then copied to another two disks. The HDD tier could perform better with a higher number of columns, but it has to follow the SSD tier layout.

Figure 3 – Tiered Storage Spaces with 4 SSD, 6 HDD and 2-column 2-way mirror layout

2 SSD, 6 HDD, 1-column 2-way mirror
In this example the SSD tier has 2 disks, so the maximum number of columns is 1. No data striping can occur in such a layout. Data is written to the first SSD and then another copy is written to the other SSD. The same behavior occurs on the HDD tier, so the performance of the HDD tier is equal to the performance of a single HDD.

Figure 4 – Tiered Storage Spaces with 2 SSD, 6 HDD and 1-column 2-way mirror layout
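The column limit used in the two tiered examples above can be sketched as follows. This is a simplified Python model under the assumption stated in the text, that both tiers must share the same column count in a two-way mirror; `max_tiered_columns` is a hypothetical name, not a real cmdlet or API.

```python
# Illustration only: the tier with fewer disks limits the column count
# of a tiered two-way mirror. max_tiered_columns is a hypothetical helper.
TWO_WAY_MIRROR_COPIES = 2

def max_tiered_columns(ssd_count: int, hdd_count: int) -> int:
    """Both tiers share one column count, so the smaller tier sets the limit."""
    smaller_tier = min(ssd_count, hdd_count)
    return smaller_tier // TWO_WAY_MIRROR_COPIES

# The two examples from the text.
for ssd, hdd in ((4, 6), (2, 6)):
    columns = max_tiered_columns(ssd, hdd)
    print(f"{ssd} SSD + {hdd} HDD: at most {columns} column(s)")
```

In both cases the six HDDs could support three columns on their own, which is exactly the performance the HDD tier gives up by following the SSD tier.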

Parallel rebuild

This feature automatically rebuilds a Storage Space from the free space in the storage pool when a physical disk fails. Marketing says that you don’t need an extra disk as a hot spare, which is true, but you do need to reserve free space in the pool. For this feature to work, the unprovisioned space has to be at least as large as a single disk. This applies to both tiers, taking into account the size of the disks in each tier. Keep in mind that each tier rebuilds only within that tier. The main advantage is that the data from the failed disk is rebuilt onto multiple physical disks in the pool instead of onto a single hot spare. Full resiliency is restored in a shorter time, whereas rebuilding onto a single large disk can be time consuming.

Figure 5 – Storage Spaces parallel rebuild
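The free space rule for parallel rebuild can be expressed as a small per-tier check. The Python sketch below is a simplified model, not a query against a real pool: the function name, the sample disk sizes and the choice to compare against the largest disk in the tier are all assumptions made for illustration.

```python
# Illustration only: checks the "reserve at least one disk's worth of
# unprovisioned space per tier" rule. Names and sample sizes are hypothetical.

def tier_can_rebuild(unprovisioned_gb: int, disk_sizes_gb: list[int]) -> bool:
    """True when the tier's free pool space covers its largest disk (assumption)."""
    if not disk_sizes_gb:
        raise ValueError("tier has no disks")
    return unprovisioned_gb >= max(disk_sizes_gb)

# Made-up example pool: each tier is checked on its own,
# because a tier rebuilds only within that tier.
tiers = {
    "SSD": {"disk_sizes_gb": [400, 400, 400, 400], "unprovisioned_gb": 500},
    "HDD": {"disk_sizes_gb": [4000] * 6, "unprovisioned_gb": 2000},
}

for name, tier in tiers.items():
    ok = tier_can_rebuild(tier["unprovisioned_gb"], tier["disk_sizes_gb"])
    status = "enough" if ok else "NOT enough"
    print(f"{name} tier: {status} free space for a parallel rebuild")
```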

Storage Pool Expansion

If you want to expand a Storage Pool, you have to follow the Storage Spaces layout. If you are using tiers, each tier can be expanded independently. In the example below, which is a 2-column 2-way mirror tiered Storage Space, you need a minimum of 4 SSDs or 4 HDDs to expand the respective tier. For the layout shown in Figure 4, two disks per tier are sufficient for the expansion.

Figure 6 – Expanding Storage Spaces
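As a rough planning aid, the expansion rule can be sketched in Python. It assumes a two-way mirror and simply applies the counts given above (4 disks per tier for 2 columns, 2 disks per tier for 1 column); both function names are hypothetical.

```python
# Illustration only: how many disks one tier needs before it can be expanded.
# Both helpers are hypothetical and assume a two-way mirror.
TWO_WAY_MIRROR_COPIES = 2

def disks_per_expansion_step(number_of_columns: int) -> int:
    """Disks to add to a tier so that a full mirrored stripe set fits."""
    return number_of_columns * TWO_WAY_MIRROR_COPIES

def can_expand_tier(added_disks: int, number_of_columns: int) -> bool:
    """True when the added disks cover at least one expansion step."""
    return added_disks >= disks_per_expansion_step(number_of_columns)

# Suppose two new disks are added to one tier.
for columns in (1, 2):
    needed = disks_per_expansion_step(columns)
    ok = can_expand_tier(added_disks=2, number_of_columns=columns)
    print(f"{columns}-column layout: needs {needed} disks, 2 added, expandable: {ok}")
```

This is the trade-off behind choosing fewer columns: a smaller stripe set performs worse, but it can be expanded in smaller steps.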

Conclusion

There is no magic layout that fits all. Below are a few guidelines, which you may or may not choose to follow. They are based on my experience and reflect my view of how it should be done.

  • The more columns, the better the performance (bigger stripe).
  • Always follow the storage layout.
  • When using tiers, use at least 4 SSDs if you want striping.
  • Create a single Storage Space in a Storage Pool. Mixing Simple, Mirror, Parity or Dual Parity in the same Storage Pool is only for non-demanding workloads, and the layout is hard to track.
  • Unprovisioned space in the pool has to be large enough for a parallel rebuild to work.
  • Test, test and test before you put anything into production.