STB Suite

SATA SMART

What is SMART?

SMART (Self-Monitoring, Analysis and Reporting Technology) is the method used by ATA drives (both PATA and SATA) to report and log various performance and error conditions that occur as a drive is used. In the ATA world, SMART performs the functions handled by Log Pages in the SCSI/FC/SAS world.

SMART data usually consists of various parameters, which are typically compared to threshold values. Some examples of typical parameters are Power-On Hours, Start/Stop Count, Reallocation Event Count, and Raw Read Error Rate.

Looking at thresholds

SMART values typically start at a high number and are decremented. When the parameter value is equal to or less than the threshold for that parameter, an error condition is declared. In the real world it is better to see when a SMART parameter is approaching its threshold than to wait for it to fail. The STB Suite deals with this in two ways. First, in the “original” mode, when a SMART parameter is within 10% of its threshold value, a warning is displayed and the parameter in question is shown in amber. Once the threshold has been reached or exceeded, it is shown in red. Second, in the multi-drive DMM mode, you may specify how closely a parameter is allowed to approach its threshold and screen a drive on that basis, as well as on whether a parameter exceeds its threshold.
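The amber/red behavior described above can be sketched in a few lines of Python. This is only an illustration, not the STB Suite's actual logic, which this article does not spell out. The function name `classify`, the `margin` parameter, the reading of "within 10%" as "at or below 110% of the threshold", and the choice to skip attributes whose threshold is 0 are all assumptions, and the Suite's own calculation may differ.

```python
def classify(value: int, threshold: int, margin: float = 0.10) -> str:
    """Return "red", "amber", or "ok" for a normalized SMART value.

    Assumptions (not taken from the STB Suite): a threshold of 0 means the
    attribute has no failure threshold, and "within margin" means the value
    is at or below threshold * (1 + margin).
    """
    if threshold == 0:
        return "ok"       # nothing to compare against
    if value <= threshold:
        return "red"      # threshold reached or exceeded
    if value <= threshold * (1 + margin):
        return "amber"    # approaching the threshold
    return "ok"


if __name__ == "__main__":
    # Illustrative (value, threshold) pairs, not real drive data.
    for value, threshold in [(200, 51), (55, 51), (51, 51), (100, 0)]:
        print(value, threshold, classify(value, threshold))
```

Passing a larger margin, for example `classify(60, 51, margin=0.25)`, is one way to model the DMM option of choosing how close a parameter may get to its threshold before a drive is screened out.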

Looking at absolute values

Some SMART parameters are absolute values and thus are not compared with thresholds. Parameters such as Power-On Hours, Start/Stop Cycles, and Device Temperature are to be viewed simply as counters or values. Thresholds do not make sense for this type of information, so simply logging the values or comparing them with those of similar drives is most useful. The DMM SMART Screening does allow you to screen a drive based on the absolute value of any parameter you choose, so you may screen out drives using your own criteria.
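Screening on absolute values amounts to comparing raw counters against limits you pick yourself. The sketch below assumes the raw values have already been decoded to integers. The `screen_absolute` function, the dictionary layout, and the example limits are hypothetical and do not reflect how the DMM is actually configured.

```python
from __future__ import annotations


def screen_absolute(raw_values: dict[str, int], limits: dict[str, int]) -> list[str]:
    """Return the names of attributes whose raw value exceeds its user limit.

    Attributes without an entry in `limits` are not screened.
    """
    return [
        name
        for name, value in raw_values.items()
        if name in limits and value > limits[name]
    ]


if __name__ == "__main__":
    # Raw values already decoded to integers.
    drive = {
        "Power-On Hours Count": 14767,
        "Start/Stop Count": 335,
        "Device Temperature": 47,
    }
    # Hypothetical pass/fail limits chosen by the user.
    limits = {"Power-On Hours Count": 10000, "Device Temperature": 55}
    print(screen_absolute(drive, limits))
```

With these made-up limits, only the power-on hours counter is over its limit, so a drive like this one would be screened out on age alone.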

For obvious reasons, users cannot clear or change SMART values. Some drives may clear certain SMART parameters when new firmware is loaded; this is entirely vendor-specific.

The STB Suite breaks out all SMART parameters which should be looked at in terms of absolute values and shows them grouped at the bottom of its display. The example at the end of this article shows SMART data from a typical SATA drive.

Real World Issues

SMART data may be extracted, viewed, and acted upon in two ways with the STB Suite, depending on how your SATA drive is connected to your test computer.

Direct Native (Motherboard) SATA port

If your SATA drive is connected to a SATA port on your motherboard, you can view and save its SMART data in the STB Suite “original” mode. From the top menu choose ATA/SATA->Commands->SMART Data, then select your drive from the list of drives on the right; its SMART data will be displayed and interpreted.

SAT-Compliant SAS controller

If your drive is attached to a SAT-compliant SAS controller, use the Disk Manufacturing Module (DMM) SMART test type to extract and screen on SMART data.

Note: SMART data is only available from ATA drives – PATA or SATA. It is not available from SCSI, SAS, or FC drives. For these types of drives use the STB Suite Log Page functions.

An Example of SMART data

Output from ATA SMART Window

The time is Tue Feb 17 14:33:08 2009

Attribute 1 (** Raw Read Error Rate) – Flags = 0F, Current Value = 200, Threshold = 51, Worst = 200 Raw = (00 00 00 00 00 00)
Attribute 3 (Spin Up Time) – Flags = 03, Current Value = 245, Threshold = 21, Worst = 222 Raw = (85 12 00 00 00 00)
Attribute 4 (Start/Stop Count) – Flags = 32, Current Value = 100, Threshold = 0, Worst = 100 Raw = (4F 01 00 00 00 00)
Attribute 5 (** Reallocated Sector Count) – Flags = 33, Current Value = 200, Threshold = 140, Worst = 200 Raw = (00 00 00 00 00 00)
Attribute 7 (Seek Error Rate) – Flags = 0F, Current Value = 200, Threshold = 51, Worst = 200 Raw = (00 00 00 00 00 00)
Attribute 9 (Power-On Hours Count) – Flags = 32, Current Value = 80, Threshold = 0, Worst = 80 Raw = (AF 39 00 00 00 00)
Attribute 10 (Spin Retry Count) – Flags = 13, Current Value = 100, Threshold = 51, Worst = 100 Raw = (00 00 00 00 00 00)
Attribute 11 (Recalibration Retries) – Flags = 12, Current Value = 100, Threshold = 51, Worst = 100 Raw = (00 00 00 00 00 00)
Attribute 12 (Device Power-Cycle Count) – Flags = 32, Current Value = 100, Threshold = 0, Worst = 100 Raw = (46 01 00 00 00 00)
Attribute 190 (Airflow Temperature) – Flags = 22, Current Value = 53, Threshold = 45, Worst = 38 Raw = (2F 00 00 00 00 00)
!!! WARNING !!! attribute Airflow Temperature value 53 is approaching threshold 45
Attribute 194 (Device Temperature) – Flags = 22, Current Value = 253, Threshold = 0, Worst = 253 Raw = (2F 00 00 00 00 00)
Attribute 195 (Hardware ECC Recovered) – Flags = 0E, Current Value = 1, Threshold = 0, Worst = 1 Raw = (02 E9 4B 00 00 00)
Attribute 196 (** Reallocation Event Count) – Flags = 32, Current Value = 200, Threshold = 0, Worst = 200 Raw = (00 00 00 00 00 00)
Attribute 197 (** Current Pending Sector Count) – Flags = 12, Current Value = 200, Threshold = 0, Worst = 200 Raw = (00 00 00 00 00 00)
Attribute 198 (** Off-Line Scan Uncorrectable Sector Count) – Flags = 10, Current Value = 200, Threshold = 0, Worst = 200 Raw = (00 00 00 00 00 00)
Attribute 199 (Ultra DMA CRC Error Count) – Flags = 3E, Current Value = 200, Threshold = 0, Worst = 200 Raw = (0B 00 00 00 00 00)
Attribute 200 (Write Error Rate/Multi-zone Error Rate) – Flags = 09, Current Value = 200, Threshold = 51, Worst = 200 Raw = (00 00 00 00 00 00)

**********************

Attributes with raw data that may have meaning:
Attribute 4 (Start/Stop Count)  Raw Data (hex) = (4F 01 00 00 00 00), (dec) = 335
Attribute 9 (Power-On Hours Count)  Raw Data (hex) = (AF 39 00 00 00 00), (dec) = 14767
Attribute 12 (Device Power-Cycle Count)  Raw Data (hex) = (46 01 00 00 00 00), (dec) = 326
Attribute 194 (Device Temperature)  Raw Data (hex) = (2F 00 00 00 00 00), (dec) = 47
Attribute 195 (Hardware ECC Recovered)  Raw Data (hex) = (02 E9 4B 00 00 00), (dec) = 4974850
Attribute 196 (** Reallocation Event Count)  Raw Data (hex) = (00 00 00 00 00 00), (dec) = 0
Attribute 197 (** Current Pending Sector Count)  Raw Data (hex) = (00 00 00 00 00 00), (dec) = 0
Attribute 198 (** Off-Line Scan Uncorrectable Sector Count)  Raw Data (hex) = (00 00 00 00 00 00), (dec) = 0
Attribute 199 (Ultra DMA CRC Error Count)  Raw Data (hex) = (0B 00 00 00 00 00), (dec) = 11
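
The decimal figures in the listing above are consistent with reading the six raw bytes as a little-endian integer (for example, AF 39 00 00 00 00 gives 0x39AF = 14767). As a hedged sketch, a saved report in this format could be parsed as follows. The regular expression and the `parse_attribute` and `raw_to_int` names are assumptions based only on the sample lines shown here, not on a documented STB Suite file format.

```python
import re

# Pattern inferred from the sample lines above; the separator before "Flags"
# may be a hyphen or an en dash, so any run of non-space characters is accepted.
ATTR_RE = re.compile(
    r"Attribute (?P<id>\d+) \((?:\*\* )?(?P<name>.+?)\) \S+ "
    r"Flags = (?P<flags>[0-9A-Fa-f]{2}), Current Value = (?P<value>\d+), "
    r"Threshold = (?P<threshold>\d+), Worst = (?P<worst>\d+) "
    r"Raw = \((?P<raw>[0-9A-Fa-f ]+)\)"
)


def raw_to_int(raw_hex: str) -> int:
    """Decode space-separated hex bytes as a little-endian unsigned integer."""
    return int.from_bytes(bytes.fromhex(raw_hex), "little")


def parse_attribute(line: str):
    """Return a dict of fields for one attribute line, or None if it does not match."""
    m = ATTR_RE.search(line)
    if m is None:
        return None
    return {
        "id": int(m["id"]),
        "name": m["name"],
        "flags": int(m["flags"], 16),
        "value": int(m["value"]),
        "threshold": int(m["threshold"]),
        "worst": int(m["worst"]),
        "raw": raw_to_int(m["raw"]),
    }


if __name__ == "__main__":
    sample = (
        "Attribute 9 (Power-On Hours Count) - Flags = 32, Current Value = 80, "
        "Threshold = 0, Worst = 80 Raw = (AF 39 00 00 00 00)"
    )
    print(parse_attribute(sample))
```

Lines that are not attribute records, such as the warning line or the row of asterisks, return `None` and can simply be skipped.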