STB Suite

Background Scan Operations

Introduction

Some information taken from the T10 SCSI Block Commands – 4 (SBC-4) documentation:

Many SCSI, SAS, and FC drives have a Background Scan feature in which the drive, while idle, performs read medium operations for the purpose of:

  1. identifying logical blocks that are difficult to read (i.e., recoverable) or unreadable (i.e., unrecoverable);
  2. logging problems encountered during the background scan operation; and
  3. when allowed, taking a vendor specific action to repair recoverable logical blocks or perform automatic read reallocation of recoverable logical blocks.

During a background scan operation, if a read medium operation encounters a recovered error (i.e., a logical block is readable but requires extra actions (e.g., retries or application of a correction algorithm) to be read), then the drive may resolve the problem using vendor specific means.

The value of the ARRE bit in the Read-Write Error Recovery mode page determines whether or not the device server performs automatic read reassignment.

During a background scan operation, if a read medium operation encounters an unrecovered error (i.e., a logical block is unreadable), then the drive may mark the logical block unrecoverable.

The value of the AWRE bit in the Read-Write Error Recovery mode page determines whether or not the drive performs automatic write reassignment.

If the AWRE bit is set to one, then the drive performs automatic write reassignment at the start of the next write medium operation accessing that logical block.
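As a rough illustration of how these two bits can be inspected, the sketch below decodes byte 2 of the Read-Write Error Recovery mode page (page code 01h). The bit positions (AWRE in bit 7, ARRE in bit 6) are recalled from the SBC standard rather than taken from this article, so check them against the standard. The function name and the sample bytes are hypothetical.

```python
def decode_reassign_bits(page: bytes) -> dict:
    """Return the AWRE and ARRE settings from a Read-Write Error Recovery mode page.

    Assumption: byte 2 carries AWRE in bit 7 and ARRE in bit 6.
    """
    # The low six bits of byte 0 hold the page code; 01h is this page.
    if len(page) < 3 or (page[0] & 0x3F) != 0x01:
        raise ValueError("not a Read-Write Error Recovery mode page")
    flags = page[2]
    return {"AWRE": bool(flags & 0x80), "ARRE": bool(flags & 0x40)}


# Hypothetical page contents, for illustration only.
sample = bytes([0x81, 0x0A, 0xC0, 0x0B, 0xFF, 0x00, 0x00, 0x00, 0x05, 0x00, 0xFF, 0xFF])
print(decode_reassign_bits(sample))
```

With both bits set, as in the sample, the drive is allowed to reassign blocks automatically on both reads and writes.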

During a background scan operation, the drive:

  1. may scan the logical blocks in any order (e.g., based on physical block layout);
  2. should not retain any data from logical blocks in cache memory after the logical blocks are read;
  3. shall ignore pseudo unrecovered errors with correction disabled; and
  4. shall process pseudo unrecovered errors with correction enabled.

 

As you can see, the background scan functionality can be a very useful feature for ensuring the highest degree of data integrity.

For even more details have a look at a Seagate patent on the subject here –

http://www.google.com/patents/US7490261?dq=7490261

Basically the purpose of background medium scanning is to catch and correct hard or soft errors before the questionable blocks are actually used.

Determining that a particular block has a correctable or uncorrectable error takes time. Depending on how the Error Recovery Mode Page is configured, the discovery and correction of a bad block could take several seconds as the drive tries error-correction (ECC) methods, which are relatively fast, or, if those fail, one or more retries. Either way, the process takes a relatively long time to complete – so doing it before actual user data is written or read can help ensure that real user data requests will not suffer these delays.

Very clever!

Since these background scans run while the drive is otherwise idle, there is relatively little performance degradation. The only time performance could be affected is when a background scan is working in one part of the drive and a Read or Write command arrives for a different part of the drive, requiring an additional initial seek. Since the background scans are triggered by timers that wait for the drive to have been idle for a specified time, the real-world impact on performance is negligible.

If you have high-value data on the drive it is well worth enabling background media scanning.

 

Basic Settings

The behavior of a drive's Background Medium Scan is determined by the settings of the various parameters of the Background Control Mode Page

– Mode Page 0x1C Sub-Page 1
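For readers who issue SCSI commands directly rather than through the STB Suite screens, the following sketch builds a MODE SENSE(10) command descriptor block that requests this page. The CDB layout is recalled from the SPC standard and should be verified there. The function name and the 64-byte allocation length are assumptions made for illustration, and sending the CDB requires a pass-through interface that is not shown.

```python
import struct


def mode_sense10_cdb(page_code: int, subpage: int, alloc_len: int = 64, pc: int = 0) -> bytes:
    """Build a 10-byte MODE SENSE(10) CDB (hypothetical helper).

    pc selects which values are returned:
    0 = current, 1 = changeable, 2 = default, 3 = saved.
    """
    if not (0 <= page_code <= 0x3F and 0 <= subpage <= 0xFF and 0 <= pc <= 3):
        raise ValueError("page code, subpage, or pc out of range")
    return struct.pack(
        ">BBBBBBBHB",
        0x5A,                    # byte 0: MODE SENSE(10) operation code
        0x00,                    # byte 1: LLBAA and DBD left at zero
        (pc << 6) | page_code,   # byte 2: page control and page code
        subpage,                 # byte 3: subpage code
        0, 0, 0,                 # bytes 4-6: reserved
        alloc_len,               # bytes 7-8: allocation length, big-endian
        0x00,                    # byte 9: control
    )


cdb = mode_sense10_cdb(0x1C, 0x01)
print(cdb.hex())
```

Requesting the changeable values (pc=1) is the standard way a drive reports which bits of a page can be modified.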

 

Here is a representation of this Mode Page and its parameters –

And here is how this Mode Page is viewed and edited in the STB Suite –

First, here’s how to choose Sub-Page 1 of Mode Page 0x1C –

 

 

And here is the actual Page/Parameters –

Changing Settings

A quick refresher in dealing with Mode Page/Parameter settings…

Each Mode Page Parameter has a Mask value, which defines which bits in a given Byte may be set. If there is a mask of 0 then no bits in that byte may be set – in other words you cannot change that byte.

If, on the other hand, you see a mask of 255 (0xFF), you know that you can set any value into that byte. (For a two-byte field the equivalent all-bits mask is 65535, or 0xFFFF.)

Keep this in mind as you set the parameters.
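The mask rule can be sketched in a few lines. This is only an illustration of the bit arithmetic, not how the STB Suite itself is implemented, and both function names are hypothetical.

```python
def apply_with_mask(current: int, desired: int, mask: int) -> int:
    """Change only the bits the mask marks as changeable; keep the rest as they are."""
    return (current & ~mask) | (desired & mask)


def rejected_bits(current: int, desired: int, mask: int) -> int:
    """Return the bits that would need to change but are not changeable."""
    return (current ^ desired) & ~mask


# Mask of 0: nothing can change, so the current value survives.
print(hex(apply_with_mask(0x12, 0xFF, 0x00)))
# Mask of 0xFF: any value can be written into the byte.
print(hex(apply_with_mask(0x12, 0x34, 0xFF)))
# Mask of 0x01: only bit 0 is changeable; bits 1 and 2 of the request are rejected.
print(hex(apply_with_mask(0x00, 0x07, 0x01)), hex(rejected_bits(0x00, 0x07, 0x01)))
```

Checking rejected_bits before sending a change makes it obvious when a requested setting cannot take effect on a particular drive.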

The Enable/Disable bits

The four values at the top of the Mode Page display are bits which determine what Background Scan features are enabled or not. Here are the actual definitions of these parameters –

The S_L_FULL bit

0 A suspend on log full (S_L_FULL) bit set to zero specifies that the device server shall continue running a background scan operation (see SPC-4) even if the Background Scan Results log page contains the maximum number of Background Scan log parameters supported by the logical unit.

1 A S_L_FULL bit set to one specifies that the device server shall suspend a background scan operation if the Background Scan Results log page contains the maximum number of Background Scan log parameters supported by the logical unit.

The LOWIR bit

0 A log only when intervention required (LOWIR) bit set to zero specifies that the device server shall log all suspected recoverable medium errors or unrecoverable medium errors that are identified during background scan operations in the Background Scan Results log page.

1 A LOWIR bit set to one specifies that the device server shall only log medium errors identified during background scan operations in the Background Scan Results log page that require application client intervention.

The EN_BMS (Enable Background Medium Scan) bit

0 An enable background medium scan (EN_BMS) bit set to zero specifies that background medium scan is disabled.

1 An EN_BMS bit set to one specifies that background medium scan operations are enabled. If the EN_PS bit is also set to one then a background medium scan operation shall not start until after the pre-scan operation is halted or completed. If a background medium scan operation is in progress when the EN_BMS bit is changed from one to zero, then the logical unit shall suspend the background medium scan operation before the device server completes the MODE SELECT command, and the background medium scan shall remain suspended until the EN_BMS bit is set to one, at which time the background medium scan operation shall resume, beginning with the logical block being tested when the background medium scan was suspended.

The EN_PS (enable pre-scan) bit

0 An enable pre-scan (EN_PS) bit set to zero specifies that pre-scan is disabled. If a pre-scan operation is in progress when EN_PS is changed from a one to a zero then pre-scan is halted.

1 An EN_PS bit set to one specifies that a pre-scan operation is started after the next power on cycle. Once this pre-scan has completed, another pre-scan shall not occur unless the EN_PS bit is set to zero, then set to one, and another power on cycle occurs.
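The four enable/disable bits described above can be pulled out of the raw page bytes as follows. This is a sketch that assumes the layout recalled from SBC-3 (S_L_FULL, LOWIR, and EN_BMS in bits 2, 1, and 0 of byte 4, and EN_PS in bit 0 of byte 5); verify the offsets against the standard. The function name and sample bytes are hypothetical.

```python
def decode_bms_flags(page: bytes) -> dict:
    """Decode the enable/disable bits of the Background Control mode page.

    Assumed layout: byte 4 = S_L_FULL (bit 2), LOWIR (bit 1), EN_BMS (bit 0);
    byte 5 = EN_PS (bit 0).
    """
    # Page code 1Ch lives in the low six bits of byte 0; byte 1 is the subpage code.
    if len(page) < 6 or (page[0] & 0x3F) != 0x1C or page[1] != 0x01:
        raise ValueError("not a Background Control mode page")
    return {
        "S_L_FULL": bool(page[4] & 0x04),
        "LOWIR": bool(page[4] & 0x02),
        "EN_BMS": bool(page[4] & 0x01),
        "EN_PS": bool(page[5] & 0x01),
    }


# Hypothetical page: scan enabled, suspend on log full, pre-scan disabled.
sample = bytes([0x5C, 0x01, 0x00, 0x0C, 0x05, 0x00,
                0x00, 0xA8, 0x00, 0x18, 0x01, 0xF4, 0x00, 0x64, 0x00, 0x00])
print(decode_bms_flags(sample))
```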

The Interval fields

 

The BACKGROUND MEDIUM SCAN INTERVAL TIME field

The BACKGROUND MEDIUM SCAN INTERVAL TIME field specifies the minimum time, in hours, between the start of one pre-scan or background medium scan operation and the start of the next background medium scan operation. If the current background medium scan operation takes longer than the value specified in the BACKGROUND MEDIUM SCAN INTERVAL TIME field, then the logical unit shall:

  1. continue the current background pre-scan operation or background medium scan operation until that background scan operation is complete; and
  2. start the next background medium scan operation upon completion of the current background pre-scan operation or background medium scan operation.
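The interval rule amounts to taking the later of two times: the start of the current scan plus the interval, or the moment the current scan finishes. A minimal sketch, with a hypothetical function name and made-up durations, follows. It yields only the earliest possible start, because the drive must still satisfy its idle-time requirement before a scan actually begins.

```python
def earliest_next_scan_start(start_h: float, duration_h: float, interval_h: float) -> float:
    """Earliest hour at which the next background medium scan may start.

    start_h:    hour at which the current scan started
    duration_h: how long the current scan takes to complete
    interval_h: BACKGROUND MEDIUM SCAN INTERVAL TIME, in hours
    """
    return max(start_h + interval_h, start_h + duration_h)


# Scan finishes well inside a 168-hour interval: wait out the interval.
print(earliest_next_scan_start(0, 10, 168))
# Scan runs longer than the interval: the next one follows on completion.
print(earliest_next_scan_start(0, 200, 168))
```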

The BACKGROUND PRE-SCAN TIME LIMIT field

The BACKGROUND PRE-SCAN TIME LIMIT field specifies the maximum time, in hours, for a background pre-scan operation to complete. If the background pre-scan operation does not complete within the specified time then the device server shall halt the background pre-scan operation. A value of zero specifies an unlimited timeout value.

The MINIMUM IDLE TIME BEFORE BACKGROUND SCAN field

The MINIMUM IDLE TIME BEFORE BACKGROUND SCAN field specifies the time, in milliseconds, that the logical unit shall be idle after suspending a background scan operation before resuming a background scan operation (e.g., after the device server has completed all of the commands in the task set).

The MAXIMUM TIME TO SUSPEND BACKGROUND SCAN field

The MAXIMUM TIME TO SUSPEND BACKGROUND SCAN field specifies the time, in milliseconds, that the device server should take to start processing a command received while a logical unit is performing a background scan operation.
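The four time fields can be read from the raw page as big-endian 16-bit values. The sketch below assumes they occupy bytes 6 through 13 in the order described above, which is how the layout is recalled from SBC-3; verify the offsets against the standard. The function name, dictionary keys, and sample bytes are hypothetical.

```python
import struct


def decode_bms_times(page: bytes) -> dict:
    """Decode the interval and time-limit fields of the Background Control mode page.

    Assumed layout: four big-endian 16-bit fields starting at byte 6.
    """
    if len(page) < 14 or (page[0] & 0x3F) != 0x1C or page[1] != 0x01:
        raise ValueError("not a Background Control mode page")
    interval, prescan_limit, min_idle, max_suspend = struct.unpack_from(">HHHH", page, 6)
    return {
        "scan_interval_hours": interval,
        "prescan_limit_hours": prescan_limit,   # zero means no time limit
        "min_idle_ms": min_idle,
        "max_suspend_ms": max_suspend,
    }


# Hypothetical page: 168-hour interval, 24-hour pre-scan limit, 500 ms idle, 100 ms suspend.
sample = bytes([0x5C, 0x01, 0x00, 0x0C, 0x05, 0x00,
                0x00, 0xA8, 0x00, 0x18, 0x01, 0xF4, 0x00, 0x64, 0x00, 0x00])
print(decode_bms_times(sample))
```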

 

 

Viewing Background Scan Results

The results of all Background Scans are stored in Log Page 15 (a hexadecimal page code, i.e. 0x15) – the Background Scan Results Page.

Here is how to view this page in the STB Suite –

Right-click on the drive of interest, then choose View Log Pages.

If what you see looks like this –

Then check the Use .def File box and choose the file Default.def – now you will see this –

 

Double-click on Page 15 – Background Scan Results Page and its contents will be displayed on the right.

Some information of interest –

Accumulated Power On Minutes – The ACCUMULATED POWER ON MINUTES field indicates the number of minutes the device server has been powered on since manufacturing.

Background Scan Status – the available descriptions for this are:

BACKGROUND MEDIUM SCAN PROGRESS field

The BACKGROUND MEDIUM SCAN PROGRESS field indicates the percent complete of a background scan operation in progress. The returned value is a numerator that has 65 536 (i.e., 1_0000h) as its denominator. If there is no background scan operation in progress (i.e., no background scan operation has been initiated since power on or the most recent background scan operation has completed), then the device server shall set the BACKGROUND MEDIUM SCAN PROGRESS field to 0000h.
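Because the progress value is a numerator over 65 536, converting it to a percentage is a single division. A small sketch, with a hypothetical function name:

```python
def scan_progress_percent(raw: int) -> float:
    """Convert the 16-bit BACKGROUND MEDIUM SCAN PROGRESS value to a percentage."""
    if not 0 <= raw <= 0xFFFF:
        raise ValueError("progress field is a 16-bit value")
    return raw * 100 / 65536


print(scan_progress_percent(0x8000))  # halfway through the scan
print(scan_progress_percent(0x4000))  # a quarter of the way through
print(scan_progress_percent(0x0000))  # no scan in progress
```

Since the field is only 16 bits wide, the largest reportable value (FFFFh) is just under 100 percent.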

NUMBER OF BACKGROUND MEDIUM SCAN PERFORMED field

The NUMBER OF BACKGROUND MEDIUM SCAN PERFORMED field indicates the number of background medium scan operations that have been performed since the SCSI target device was shipped by the manufacturer. If the NUMBER OF BACKGROUND MEDIUM SCAN PERFORMED field contains 0000h, then the number of background medium scan operations is not reported.

If an error occurred during a background scan there will be additional information in the Log, described in Background Scan Parameters. Here is an example of a drive which found a bad block during a background scan –

This particular drive has 35 of these Background Scan Parameters.

Note the detail provided – very valuable information!

Summary

The Background Scan options for SAS, SCSI, and FC drives are a very powerful and valuable tool for ensuring high levels of data integrity. Knowing how to configure these scans and how to interpret the results will show your value to your customers!