If you are testing SATA or SAS rotating magnetic hard disk drives (HDDs) or solid-state drives (SSDs) for use in cloud storage or other trusted applications, you must use the best testing tools available.
A recent Information Week poll illustrates the concern the marketplace has with cloud storage hardware.
With customers demanding end-to-end data integrity, and with pressure from consumer groups and government requirements such as HIPAA, you must test with the goal of absolute data integrity to minimize the risk of data corruption. In addition, your test solution needs to be able to stress I/O rates and report all of the I/O metrics you need. And with today’s fast-evolving SSD technology, your test solution must be versatile and adaptable to new issues and failure modes. In this paper we compare the features of IOMeter and the STB Suite and address these important points.
Block-Based versus File-Based testing
IOMeter is a file-based test; that is, it creates files on the Device Under Test (DUT) and executes I/O to and from these files.
The STB Suite, on the other hand, is a block-based tester. It does not care or even know about file systems or structures; it does all of its I/O to individual blocks on the DUT.
Block-based testing allows complex and powerful methods to ensure that the data the test wrote to any given block can be read back and verified to match what was written. High-quality block-based testing can provide 100% assurance of the data integrity of the DUT.
File-based testing only measures I/O throughput; it does not measure data integrity.
Again, we stress that IOMeter is a file-based test method which cannot reveal data reliability problems, whereas the STB Suite is a block-based test method with complex and powerful features designed to reveal any data integrity issues.
A quick look at data integrity test features
At the simplest level, these features let you confirm three things when you write a data pattern to a given block and then read that block back: that the data pattern compares byte-by-byte, that any overlays conform to your specifications, and that the data read did indeed come from the correct LBA.
Beyond simple data-compare – pattern overlays
Pattern overlays allow specific data to be embedded into the data written to each block. Overlays include the LBA number of the block, the seed used to generate the random data pattern for the block, the time the block was written, and user-definable key values. These overlays are placed at the beginning and the end of each block during write testing.
This provides the assurance that if you write data to block 1000 and then read block 1000, the data was indeed retrieved from block 1000. This type of overlay confirms that caching and queuing are working as intended and are not mixing data between blocks.
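The overlay idea can be sketched in a few lines of Python. This is a minimal illustration and not the STB Suite’s actual block format: the overlay layout (LBA, seed, timestamp, and key packed as four little-endian 64-bit fields), the 512-byte block size, and the function names `build_block` and `verify_block` are all assumptions made for this example.

```python
import random
import struct
import time

BLOCK_SIZE = 512                   # assumed block size in bytes
OVERLAY = struct.Struct("<QQQQ")   # hypothetical layout: LBA, seed, timestamp, key


def build_block(lba, seed, key, timestamp=None):
    """Return one block of seeded random data with overlays at both ends."""
    if timestamp is None:
        timestamp = int(time.time())
    rng = random.Random(seed)
    body = bytearray(rng.getrandbits(8) for _ in range(BLOCK_SIZE))
    stamp = OVERLAY.pack(lba, seed, timestamp, key)
    body[:OVERLAY.size] = stamp     # overlay at the beginning of the block
    body[-OVERLAY.size:] = stamp    # overlay at the end of the block
    return bytes(body)


def verify_block(data, expected_lba, expected_key):
    """Return a list of error strings; an empty list means the block passed."""
    errors = []
    head = OVERLAY.unpack(data[:OVERLAY.size])
    tail = OVERLAY.unpack(data[-OVERLAY.size:])
    if head != tail:
        errors.append("leading and trailing overlays differ")
    lba, seed, timestamp, key = head
    if lba != expected_lba:
        errors.append(f"LBA mismatch: expected {expected_lba}, found {lba}")
    if key != expected_key:
        errors.append(f"key mismatch: expected {expected_key}, found {key}")
    # Regenerate the pattern from the stored seed and compare byte-by-byte.
    if data != build_block(lba, seed, key, timestamp):
        errors.append("data pattern miscompare")
    return errors
```

With this kind of check, a read of block 1000 that returns a block stamped with any other LBA is reported as an LBA mismatch, even if the data pattern in that block is internally consistent.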
Time-stamping records, in each block’s data, the exact time that the block was written.
This can be used to confirm that the writes in a test are actually making their way to the media. This is a complex mechanism, so here is a simple example of how it could be used:
- First, a few hours before your tests, “prep” the drive by writing all blocks of the drive with the timestamp overlay included.
- Wait a bit so that there is a good time delta.
- Now do your testing, including the timestamp overlay in your data pattern comparisons.
- When the read portion of your test executes, each block is checked to ensure that its timestamp is recent, indicating that the data was just written.
- If one of your writes fails, then the next read to that block or block range will return data that contains an “old” timestamp – showing that either the test write failed (and didn’t report a failure), or that the write went to cache and never made its way to the media, or that your read got its data from old cache.
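The check described in the last two steps amounts to comparing each block’s embedded timestamp against the time the current test started. Here is a minimal sketch, assuming timestamps are stored as seconds since the epoch; the function name `classify_timestamp` and its return strings are hypothetical.

```python
def classify_timestamp(block_timestamp, test_start, now):
    """Classify a block's embedded write time relative to the current test run.

    All arguments are seconds since the epoch (an assumption of this sketch).
    """
    if block_timestamp > now:
        return "invalid"   # stamped in the future: corrupt overlay or clock problem
    if block_timestamp >= test_start:
        return "fresh"     # written during this test run, as expected
    return "stale"         # older than this test: lost write or stale cached read


test_start = 1_700_010_800   # example value: when the current test began
now = test_start + 600       # ten minutes into the test

print(classify_timestamp(test_start + 30, test_start, now))      # fresh
print(classify_timestamp(test_start - 10_800, test_start, now))  # stale (prep pass)
```

The longer the wait between the prep pass and the test, the less room there is for clock jitter to blur the line between “fresh” and “stale”.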
As you can see this is a powerful test enhancement with many uses and possibilities.
With key-stamping, you can assign a unique key to groups in your organization, or to each of multiple machines that are all connected to and testing the same storage system.
As with time-stamping, key-stamping can be used in some very creative ways. Here is one example:
- You have three machines all connected to the same drive enclosure or drive.
- Each test machine is assigned a unique key.
- Each test machine is set up to test its own block range, being sure not to overlap with the others.
- During testing, all writes include the test machine’s unique key in each block of data.
- A given test machine should never write into the wrong block range and should never read from the wrong block range.
If you get a key compare error, it would indicate one of several problems:
- That the drive’s cache is faulty and is putting blocks of cached data to the wrong media blocks, or
- Some part of the storage system (dual ports, bridges, etc.) is putting data in the wrong place, or
- That data is being silently lost during writes or reads.
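A key compare for the three-machine example could look like the following sketch. The key values, the block ranges, and the names `RANGES`, `owner_of`, and `check_key` are made up for illustration.

```python
# Hypothetical assignment: key -> half-open LBA range owned by that test machine.
RANGES = {
    0xA1: range(0, 1_000_000),
    0xB2: range(1_000_000, 2_000_000),
    0xC3: range(2_000_000, 3_000_000),
}


def owner_of(lba):
    """Return the key of the machine that owns this LBA, or None if unassigned."""
    for key, lbas in RANGES.items():
        if lba in lbas:
            return key
    return None


def check_key(lba, key_found):
    """Return None if the key read from the block is correct, else an error string."""
    expected = owner_of(lba)
    if expected is None:
        return f"LBA {lba} is outside every assigned range"
    if key_found != expected:
        return (f"key compare error at LBA {lba}: "
                f"expected {expected:#x}, found {key_found:#x}")
    return None
```

A block in the second machine’s range that carries the first machine’s key means some component placed the first machine’s data at the wrong address.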
Data Pattern Compare
Being able to compare data is of course of utmost importance. A good test system will provide many different data patterns such as walking 1s, walking 0s, random data, or user-definable data. Perhaps a drive manufacturer has a unique data pattern which stresses the drive’s ECC to the maximum; this pattern can be used in the STB Suite as a user-defined data pattern.
A recent version of IOMeter has three available data patterns. The STB Suite has 13 patterns defined and is open-ended in that it allows any data pattern to be defined via the user-defined pattern.
Recent issues with SSDs have shown the importance of not being locked into a few data patterns for testing – the STB Suite has no such limitations. And of course, every data pattern in the STB Suite can be overlaid with LBA, time, random seed, and key values.
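For readers unfamiliar with the pattern names, here is one common way to generate walking-1s, walking-0s, and user-defined patterns. Definitions vary between tools, and this sketch does not claim to reproduce the exact patterns of the STB Suite or IOMeter; the function names are hypothetical.

```python
def walking_ones(length):
    """Bytes with a single 1 bit that moves one position per byte: 01 02 04 ... 80."""
    return bytes(1 << (i % 8) for i in range(length))


def walking_zeros(length):
    """Bitwise complement of walking ones: FE FD FB ... 7F."""
    return bytes(0xFF ^ (1 << (i % 8)) for i in range(length))


def user_defined(pattern, length):
    """Repeat a user-supplied byte sequence until it fills `length` bytes."""
    if not pattern:
        raise ValueError("pattern must contain at least one byte")
    repeats = -(-length // len(pattern))   # ceiling division
    return (pattern * repeats)[:length]


print(walking_ones(8).hex())    # 0102040810204080
print(walking_zeros(8).hex())   # fefdfbf7efdfbf7f
```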
By “complexity” we do not mean “difficult to use” or “high learning curve”, terms that some people would agree describe IOMeter. When we say complexity, we mean versatility: being able to define and run tests made up of any number of steps in one test session.
Here is an example from the world of SSDs. An SSD manufacturer related that the steps they need to perform to test their products are:
- Format or erase the drive
- Write a known pattern to all blocks of the drive
- Write another known pattern to all blocks of the drive
- Execute random writes for some period of time, using 4K-per-IO transfers and record throughput metrics
- Execute more random writes but this time using 8K-per-IO transfers
- Get a summary of the results
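One way to think about such a sequence is as an ordered list of steps, each with its own parameters, executed in a single session. The sketch below only models the plan as data. The `Step` class, its field names, and the placeholder pattern names and durations are assumptions for illustration; they do not represent any tool’s actual format, and the source does not state how long the random-write steps run.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Step:
    action: str                        # "format", "write_all", or "random_write"
    pattern: Optional[str] = None      # placeholder pattern name for write_all
    io_size: Optional[int] = None      # bytes per I/O for random_write
    duration_s: Optional[int] = None   # placeholder run time for random_write


PLAN = [
    Step("format"),
    Step("write_all", pattern="pattern_a"),
    Step("write_all", pattern="pattern_b"),
    Step("random_write", io_size=4 * 1024, duration_s=600),
    Step("random_write", io_size=8 * 1024, duration_s=600),
]


def describe(plan):
    """Return one human-readable line per step, in execution order."""
    lines = []
    for number, step in enumerate(plan, start=1):
        fields = (("pattern", step.pattern),
                  ("io_size", step.io_size),
                  ("duration_s", step.duration_s))
        details = " ".join(f"{name}={value}" for name, value in fields
                           if value is not None)
        lines.append(f"{number}. {step.action} {details}".rstrip())
    return lines


for line in describe(PLAN):
    print(line)
```

Keeping the plan as data rather than as separate tool invocations is what makes it possible to save, recall, and rerun the same sequence.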
Accomplishing this test sequence using IOMeter requires running several distinct tests with different tools.
Accomplishing this test in the STB Suite Disk Manufacturing Module (DMM) is a simple matter of defining the 5 steps using a point-and-click menu, then selecting the drive(s) to run the test against. Period. The whole process, from defining the test steps to having the test running on as many drives as you wish takes all of 30 seconds. And that test can be saved so that another operator may recall the test and run it – that takes the time down to maybe 10 seconds!
If you need to format the drive between test steps – with STB Suite you can.
If you need to change blocks-per-IO – with STB you can.
If you need to manipulate an environmental chamber between tests – with STB you can.
If you need to loop your test sequence for 48 hours – with STB you can.
If you need to access vendor-unique manufacturing logs on your device during the test – with STB you can.
If you need to completely customize your tests to your exact needs – with STB you can.
What we are talking about is complexity coupled with versatility accomplished with simplicity.
The STB Suite does not tell you how you will test your device – in a simple point-and-click menu you tell it how you want to test.
You are the boss – your test software works for you. Not the other way around.