bad blocks and surface scans

Discussion in 'DIY Computers' started by if, Feb 24, 2006.

  1. if

    if Guest

    SMART reports for my 120GB Deskstar started to show about 1 new remapped
    sector a month 6 months ago. Today a suspicious crash led me to do a
    surface scan (read only) after which I found two more sectors had been
    mapped out since I last did a SMART 4 days ago. A second surface scan
    immediately after the first resulted in two more. (Scandisk reports no
    errors and 0 bad sectors each time so all this is being done at drive
    level.)

    During the surface scans I noted that disk read rate falls from a steady
    24MB/sec to a hugely varying rate of around 5-10MB/sec for a few hundred
    thousand clusters (170,000-600,000) before settling back to the higher
    speed (the partition has 1.3 million 4KB clusters). Some of this slowdown
    is clearly due to repeated attempts to read some poor sectors but I wonder
    if it's also partly the heads having to seek to access the already remapped
    sectors (of which there are now 14), since the slowdown continues for such
    a long time. Surely there can't be a whole gigabyte or two of dodgy sectors
    can there?
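
    The "gigabyte or two" figure can be sanity-checked from the numbers quoted above. A minimal Python sketch, assuming the slow region runs from cluster 170,000 to cluster 600,000 and that every cluster is 4KB (the function names are made up for illustration):

```python
CLUSTER_BYTES = 4 * 1024          # 4KB clusters, as stated in the post
TOTAL_CLUSTERS = 1_300_000        # approximate size of the boot partition


def slow_region_gib(start: int, end: int, cluster_bytes: int = CLUSTER_BYTES) -> float:
    """Size in GiB of the cluster range from start up to end."""
    return (end - start) * cluster_bytes / 2**30


def slow_region_fraction(start: int, end: int, total: int = TOTAL_CLUSTERS) -> float:
    """Share of the partition covered by the cluster range."""
    return (end - start) / total


if __name__ == "__main__":
    print(f"slow region: {slow_region_gib(170_000, 600_000):.2f} GiB")
    print(f"share of partition: {slow_region_fraction(170_000, 600_000):.0%}")
```

    So the slow stretch covers roughly 1.6 GiB, about a third of the partition, which is far more than 14 remapped sectors could account for on their own.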

    Assuming the answer to that is no, is it worth running the surface scan
    repeatedly to try and provoke the drive into remapping all the iffy sectors
    (maybe doing the slower read/write scan rather than just read), or might
    this somehow make the drive fail quicker? The problem area seems to be only
    on the primary 5GB boot partition of the 120GB drive, no other partitions
    exhibit the patchy disk read speeds that the boot partition shows. But
    might the problem spread (e.g. dust floating around or something)?

    Is a dozen or so remapped sectors a lot? The rate of decay is alarming me
    but the fact that they are still under scandisk's radar and seemingly
    located in one area of the disk makes me wonder if there's still life left
    in the drive. I was going to get a new drive anyhow as this one's full, but
    I wanted to keep this one for backups at least, as it's under 3 years old
    and I don't really want to junk 120GB of storage if there's a chance I can
    keep it alive.
     
    if, Feb 24, 2006
    #1

  2. if

    Mike Guest

    Can't answer your question at the moment but do you mind telling me which
    tool you use to read out this SMART info?

    Thanks, Mike
     
    Mike, Feb 24, 2006
    #2

  3. Run some of the freeware SMART tools and see if any of them reports
    imminent failure. Your BIOS may have a SMART status check built in, but
    disabled. Check the CMOS settings.

    It does sound to me though that your drive is on the way out. Run the
    maker's diagnostic (in your case, it's IBM/Hitachi's DFT) and run the
    full read/write scan - THIS WILL DESTROY YOUR DATA.

    As for using the drive for backups, well, do you feel lucky?
     
    Mike Tomlinson, Feb 24, 2006
    #3
  4. if

    if Guest

    SmartUDM. This is a sample report for the dodgy disk, it first shows
    percentage values and projected failure dates, then the actual hex values
    for the same parameters (needs fixed width font to view). The failure dates
    tend to oscillate somewhat and so aren't very plausible: it seems to do a
    straight line extrapolation between the current values and those obtained
    the first time SMARTUDM was run (in this case 2.5 years ago). For instance
    it was predicting failure by 2010 only last month, now it says 2025 despite
    a marked increase in bad sectors. A plot which took account of intermediate
    values would be better but it doesn't store any values other than the very
    first set it took. Still, the absolute values in the second table are
    useful, and SMARTUDM shows the failure thresholds to give you something to
    compare the current values against.
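
    The two-point extrapolation described above can be sketched in Python as follows. This is only a guess at the method based on the behaviour observed, not SMARTUDM's actual code, and the first stored value of 100 in the example is a made-up placeholder since the program does not show it:

```python
from datetime import date, timedelta
from typing import Optional


def extrapolate_tec(first_date: date, first_value: float,
                    now_date: date, now_value: float,
                    threshold: float) -> Optional[date]:
    """Straight-line estimate of when a SMART value will reach its threshold.

    Only two samples are used: the first one ever stored and the current one.
    Returns None when the value is not falling, i.e. no date can be
    projected (the "Unknown" case in the report).
    """
    elapsed_days = (now_date - first_date).days
    if elapsed_days <= 0:
        return None
    drop_per_day = (first_value - now_value) / elapsed_days
    if drop_per_day <= 0:
        return None
    days_left = (now_value - threshold) / drop_per_day
    return now_date + timedelta(days=round(days_left))


if __name__ == "__main__":
    # Spin Up Time from the report: current value 93, threshold 24.
    # The starting value of 100 is a placeholder, not taken from the report.
    print(extrapolate_tec(date(2003, 7, 28), 100, date(2006, 2, 24), 93, 24))
```

    With only two points, a one-point change in the current value moves the projected date by years, which would be consistent with the oscillating dates.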


    SMARTUDM - HDD S.M.A.R.T. Viewer 2.00
    Copyright (C) 2001-2003, Sysinfo Lab
    Copyright (C) 1997, Michael Radchenko
    www.sysinfolab.com e-mail:

    ■ HDD 1 Model: IC35L120AVV207-0
    ■ HDD 1 Size: 117800 Mb (115.04 Gb)
    ■ Location: Quaternary Master
    ■ Controller Revision: V24OA66A
    ■ Buffer Size: 1821.5 kb
    ■ Compatibility: ATA/ATAPI-6 revision 25
    ■ PIO Mode Support: 4
    ■ SW DMA Mode Support: None
    ■ MW DMA Mode Support: 2, Active: None
    ■ UDMA Mode Support: 5 (UltraDMA/100), Active: 5
    ■ Current AAM Value: FEh (80h recommended) - disabled
    ■ S.M.A.R.T.: [√] enabled
    ■ SMART Self-test: [√] enabled
    ■ SMART Error Logging: [√] enabled

    ■ T.E.C. prediction monitoring started at: 28-07-03, 23:18:54
      Attribute                  ID  Threshold  Value  Indicator   1/Month  T.E.C.
    ────────────────────────────────────────────────────────────────────────────────
    * Raw Read Error Rate         1         60     99  ■■■■■■■■■■      0.0  Unknown
    * Throughput Performance      2         50    100  ■■■■■■■■■■      0.0  Unknown
    * Spin Up Time                3         24     93  ■■■■■■■■■       0.3  Oct-2025
      Start/Stop Count            4          0    100  ■■■■■■■■■■      0.0  Unknown
    * Reallocated Sector Count    5          5    100  ■■■■■■■■■■      0.0  Unknown
    * Seek Error Rate             7         67    100  ■■■■■■■■■■      0.0  Unknown
    * Seek Time Performance       8         20    100  ■■■■■■■■■■      0.0  Unknown
      Power On Hours Count        9          0     98  ■■■■■■■■■■      0.1  Unknown
    * Spin Retry Count           10         60    100  ■■■■■■■■■■      0.0  Unknown
      Drive Power Cycle Count    12          0    100  ■■■■■■■■■■      0.0  Unknown
      Power-Off Retract Cycle   192         50     98  ■■■■■■■■■■      0.1  Oct-2067
      Load/Unload Cycle Count   193         50     98  ■■■■■■■■■■      0.1  Oct-2067
      Drive Temperature         194          0    161  ■■■■■■■■■■      0.1  Oct-2067
      Reallocation Events Count 196          0    100  ■■■■■■■■■■      0.0  Unknown
      Current Pending Sector    197          0    100  ■■■■■■■■■■      0.0  Unknown
      Uncorrectable Sector      198          0    100  ■■■■■■■■■■      0.0  Unknown
      UltraDMA CRC Error Rate   199          0    200  ■■■■■■■■■■      0.0  Unknown
    ────────────────────────────────────────────────────────────────────────────────
    NOTE: "*" means life-critical attribute
    ■ T.E.C. not detected.
    ■ Nearest prognosed T.E.C.: Oct-2025, Spin Up Time (Critical)

      Attribute                  ID  Threshold  Value  Worst  Raw            Type
    ────────────────────────────────────────────────────────────────────────────────
    * Raw Read Error Rate         1         60     99     99  000000010002h  ER
    * Throughput Performance      2         50    100    100  000000000000h  PR
    * Spin Up Time                3         24     93     93  0006012D0121h  PR
      Start/Stop Count            4          0    100    100  0000000006ADh  EC
    * Reallocated Sector Count    5          5    100    100  00000000000Eh  EC SP
    * Seek Error Rate             7         67    100    100  000000000000h  ER
    * Seek Time Performance       8         20    100    100  000000000000h  PR
      Power On Hours Count        9          0     98     98  000000003BF5h  EC
    * Spin Retry Count           10         60    100    100  000000000000h  EC
      Drive Power Cycle Count    12          0    100    100  000000000352h  EC SP
      Power-Off Retract Cycle   192         50     98     98  000000000B26h  EC SP
      Load/Unload Cycle Count   193         50     98     98  000000000B26h  EC
      Drive Temperature         194          0    161    161  0037000E0022h
      Reallocation Events Count 196          0    100    100  000000000011h  EC SP
      Current Pending Sector    197          0    100    100  000000000000h  SP
      Uncorrectable Sector      198          0    100    100  000000000000h  ER
      UltraDMA CRC Error Rate   199          0    200    200  000000000000h  ER
    ────────────────────────────────────────────────────────────────────────────────
    NOTE: "*" means life-critical attribute
    Attribute types:
      PR - Performance-related    ER - Error rate
      EC - Events count           SP - Self-preserve

    ■ Reallocated Sectors: 14
    ■ Current Temperature: 34 C
    ■ Drive Power Cycle Count: 850
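
    The raw column in the second table is hexadecimal with a trailing "h", and the summary lines at the bottom are just those values in decimal. A small Python sketch of the conversion (the helper names are invented here; splitting the temperature field into three 16-bit words is an assumption, and only the low word is confirmed by the "Current Temperature: 34 C" line):

```python
def parse_raw(raw: str) -> int:
    """Convert a raw field such as '00000000000Eh' to an integer."""
    return int(raw.rstrip("hH"), 16)


def split_words(raw: str) -> tuple:
    """Split a 48-bit raw value into three 16-bit words: (high, middle, low)."""
    value = parse_raw(raw)
    return (value >> 32) & 0xFFFF, (value >> 16) & 0xFFFF, value & 0xFFFF


if __name__ == "__main__":
    print("Reallocated sectors:", parse_raw("00000000000Eh"))
    print("Reallocation events:", parse_raw("000000000011h"))
    print("Power cycles:", parse_raw("000000000352h"))
    print("Power-on hours:", parse_raw("000000003BF5h"))
    # Low word matches the reported current temperature; the meaning of the
    # other two words (possibly min/max) is a guess and not confirmed here.
    print("Temperature words:", split_words("0037000E0022h"))
```

    Note that the reallocation events count (17) is higher than the reallocated sector count (14), so the two attributes are not tracking exactly the same thing.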
     
    if, Feb 24, 2006
    #4
  5. if

    if Guest

    See other posting - failure is not predicted until 2025, however to put
    that into context, back in Sept 2003 SMARTUDM did predict failure by Oct
    2004 due to read error rate. This date gradually receded to 2014 and has
    now mysteriously vanished off the prediction scale altogether.
     
    if, Feb 24, 2006
    #5
  6. if

    Fishman > Guest

    Re SmartUDM from www.sysinfolab.com

    Downloaded it - shame all the support text files with the exe are in Chinese
    ??

    The exe doesn't seem to do much either.
    Do you have to run it from DOS?
     
    Fishman >, Feb 24, 2006
    #6
  7. if

    Rob Hemmings Guest

    I agree with Mike - you really do need to run their own diagnostic
    (Drive Fitness Test) at this point. Start with the Quick tests and see
    what that throws up:
    http://www.hitachigst.com/hdd/support/download.htm#Drive fitness
    User guide:
    http://www.hitachigst.com/downloads/dft32_user_guide.pdf
    BTW, this will run with most other brands of HD too and is possibly
    the best 'free' HD diag. tool out there.
    HTH
     
    Rob Hemmings, Feb 24, 2006
    #7
  8. if

    if Guest


    It is a DOS program, it runs fine in a DOS box on my machine (Win98). I
    just have a PIF file for it saying to run in a normal window and close on
    exit. Possibly it may not work on XP et al. On my machine it starts by
    commenting that it's being run under Windows (not sure why), then gives a
    list of all the drives and asks me to pick the one I want a report on.

    I use it because it can see the disks on my PCI ATA card which the native
    Windows apps I tried couldn't (at least not ones that gave useful data -
    some SMART utilities seem to say nothing other than "everything is fine"
    unless failure is imminent, which is not the sort of program I want).

    The only problem I've found is that if a drive is asleep it doesn't wake it
    up and Win98 hangs, waiting for a response from the drive which never
    comes. So you *must* wake any sleeping drives with Explorer or something
    before running this program!

    Another program which can give similar reports to SMARTUDM is DTEMP which
    sits in the system tray. Unlike SMARTUDM this is a proper Win32 program.
    However it can't see the drives on a SCSI or PCI ATA card so I stopped
    using it. You can get it from http://private.peterlink.ru/tochinov/
    As well as being able to run off reports, it monitors your drive temp and
    SMART status at user definable intervals (e.g. every 5 mins) and issues
    an alert if temp rises above a threshold you set. If your drives are
    motherboard PATA ones and you have cooling issues I think it could be very
    useful. It hasn't been updated since 2002 so don't expect a SATA version
    any time soon! A SCSI/ATA card version was promised in 2001 but never
    emerged.
     
    if, Feb 24, 2006
    #8
  9. if

    if Guest

    Thanks, I'll take a look at that.

    My dilemma really is that although the bad sectors have been developing at
    what to me is an alarming rate (5 in five days), this doesn't even register
    as a blip on SMART's radar - the reallocated sector value is still 100%.

    Even if we assume I'm only one bad sector away from falling to 99%, the
    implication -- given that the failure threshold is 5% for this parameter --
    is that I would need to have 1400 sectors mapped out before SMART
    considered failure to be imminent, which at this rate would take another 4
    years. So I wonder if I'm worrying unduly when only one hundredth that
    number of sectors have failed. Anyhow I'll certainly keep the drive well
    backed up, and will see what the above diagnostics prog. has to say.
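
    The arithmetic behind that estimate, as a rough Python sketch. It assumes, as the post does, that about 14 remapped sectors correspond to one point of the normalized value and that sectors keep failing at one a day; how the drive really scales this attribute is not known here:

```python
def sectors_to_threshold(sectors_per_point: float,
                         current_value: int = 100,
                         threshold: int = 5) -> float:
    """Remapped sectors needed to drag the normalized value down to the threshold."""
    return (current_value - threshold) * sectors_per_point


def years_at_rate(sectors: float, sectors_per_day: float) -> float:
    """Time in years to accumulate that many sectors at a constant daily rate."""
    return sectors / sectors_per_day / 365.25


if __name__ == "__main__":
    needed = sectors_to_threshold(14)      # 14 sectors per point is an assumption
    print(f"sectors needed: {needed:.0f}")
    print(f"years at 1 sector/day: {years_at_rate(needed, 1.0):.1f}")
```

    This gives about 1330 sectors and a little over three and a half years, in line with the rounded figures of 1400 sectors and 4 years.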
     
    if, Feb 24, 2006
    #9
  10. The message <Xns9774D3FBAFABDvtqj3@216.196.109.145>
    This should only be a problem for laptop use. Anyone using a mains
    powered machine should have their wrists slapped for forgetting to
    change the default hard drive timeout to "Never".
    I've been using DTemp for the past 18 months or so with SATA and it
    seems to work, but I've always had my suspicions regarding the
    temperature reporting since it very rarely split the temperature
    readings between the two SATA drives in spite of the upper drive in the
    bay proving to be running a good 5 to 6 deg hotter. It would seem that
    DTemp accurately reports the temperature of the first drive and applies
    it to both.

    I've since fitted an extra drive bay and separated the drives as well
    as providing an auxiliary cooling fan to direct airflow around the well
    spaced drives to eliminate the temperature discrepancy so it's become
    rather academic that only one of the drives is used as the source for
    the temperature readings.

    HTH
     
    Johnny B Good, Feb 25, 2006
    #10
  11. Rather delayed response, but I can assure you that DTemp will give you
    different readings for different drives. When I had two Maxtors in my
    main system, a PATA and a SATA, both were reading different (with the
    SATA reading around 5°C hotter). Now I've got the SATA Maxtor and a
    SATA Seagate I'm still getting two different readings (with the Maxtor
    still in the lead, despite it being quite aggressively cooled).

    Now, I'll grant you I've never tried with two identical drives on two
    identical controllers, but I can't see why it wouldn't work.
     
    Gareth Halfacree, Feb 28, 2006
    #11
  12. Motherboard Monitor does too.
    Same here. 80GB Seagate Barracuda 4 PATA and 160GB Seagate 7200.7 SATA
    here. The SATA unit always reads a good few degrees higher than the
    PATA. (SATA=30C, PATA=22C at the mo.) As long as it stays under 40C,
    I'm not concerned.

    Both drives are in a 3.5" bay, with two inches separation and two fans
    providing airflow. Swapping the drives around in the bay doesn't change
    the temperature readings, so I think it's inherent to their design.
     
    Mike Tomlinson, Feb 28, 2006
    #12
  13. That hotter running temp of the Maxtors, in combination with woefully
    inadequate cooling provision in most case designs, is, in spite of
    Maxtor's very high maximum operating temperature limit of 60 deg C, very
    probably the main reason for the seemingly high failure rates being
    reported in usenet groups.

    Samsung specify the temperature limits for their Spinpoint range in
    what I thought was a very peculiar way, saying that the _optimal_
    temperature range (my emphasis) was 5 to 55 deg C.

    That statement provoked the thought, "What! All of it? 5 to 55 deg is
    _ALL_ optimal? As in 55 deg C is an optimal temperature?".

    Sadly, it would seem that just one or two degrees above _that_ optimal
    temperature range was enough to cause damage, as I discovered after
    running a high environmental temperature test whilst ignorant of the
    fact that the second drive was actually running hotter by a good 5 to 6
    deg C than DTemp was reporting for _both_ drives.
    I suppose it might be a SATA interface chipset issue, mine's a VIA
    chipset MoBo with a VIA SATA controller.

    The funniest part of this dearth of temperature splits was that there
    never were any whilst I had 120GB and 160GB Spinpoint drives fitted and
    it was only when I upgraded the 120GB to a 160GB to give me a pair that
    I started to see the occasional split reading.

    Now I'm running a pair of 320GB Western Digitals and only rarely spot
    any splits (but I have taken measures to keep both drives equally cool).

    The temperature readings do at least track the room temperature in that
    they vary between 8 and 10 deg C higher, suggesting that the readings are
    a close approximation to the actual temperature of at least one of the
    drives.

    I'm not prepared to take it as 'Gospel' that the readings are valid for
    both drives but I am now fairly confident that they are no more than a
    degree or two apart from DTemp's reports and thus feel pretty confident
    that my system is now fully capable of operation under maximum stress at
    environmental temperatures up to and including 40 deg C.
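
    The headroom reasoning above comes down to simple addition. A minimal Python sketch, assuming the drives keep running 8 to 10 deg C above room temperature and using the 55 deg C figure quoted for the Spinpoints purely as a placeholder limit (the limit for the Western Digital drives is not given in this thread):

```python
def projected_drive_temp(ambient_c: float, rise_c: float) -> float:
    """Expected drive temperature for a given room temperature and rise above it."""
    return ambient_c + rise_c


def headroom_c(ambient_c: float, rise_c: float, limit_c: float) -> float:
    """Margin left between the projected drive temperature and the limit."""
    return limit_c - projected_drive_temp(ambient_c, rise_c)


if __name__ == "__main__":
    LIMIT_C = 55.0  # placeholder limit, taken from the Spinpoint figure above
    for rise in (8.0, 10.0):
        temp = projected_drive_temp(40.0, rise)
        print(f"rise {rise:.0f} C -> drive at {temp:.0f} C, "
              f"headroom {headroom_c(40.0, rise, LIMIT_C):.0f} C")
```

    With those numbers a 40 deg C room leaves 5 to 7 deg C of margin, so a reading that is off by 5 or 6 degrees for one drive would use up most of it.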
     
    Johnny B Good, Feb 28, 2006
    #13
