Friday, February 1, 2008

Hard drive massacre

For some reason there has been a massacre of hard drives in Phnom Penh recently. Yesterday I replaced three(!!!) hard drives that were failing in the same way. The symptoms?

  • No data loss, no errors in error logs, just gradually decreasing performance
  • No strange noises, but loooooonnngggg hard drive access times. Minutes of activity where previously there had been less than a second, or maybe two.
  • Eventually (just before total death) a notice of a bad sector or two in the event logs.

What caused these three drives to all die with the same symptoms around the same time? Not sure, but I'm guessing that the extremely poor power regulation here could be part of the cause. Or the massive amounts of dust EVERYWHERE. In any event, if you notice such symptoms on your drives, back up FAST, and not over old backups. I haven't had my clients do an exhaustive check of the data they got off the drives (in only one case was it actually critical), but I'm expecting that what is there is only partially intact.
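
For anyone who wants to script the "not over old backups" part, here is a minimal sketch in Python. It is only an illustration, not what was used on these three drives: the function name and the folder naming scheme are made up for the example. The idea is simply that every run copies into a brand-new timestamped folder, so a copy taken from a dying drive can never replace an older, possibly healthier one.

```python
import shutil
import time
from pathlib import Path


def backup_to_new_folder(source: Path, backup_root: Path) -> Path:
    """Copy `source` into a fresh timestamped folder under `backup_root`.

    Hypothetical helper for illustration; the names are assumptions.
    """
    stamp = time.strftime("%Y%m%d-%H%M%S")
    destination = backup_root / f"{source.name}-{stamp}"

    # If two runs land in the same second, pick a new name instead of
    # reusing the existing folder.
    suffix = 1
    while destination.exists():
        destination = backup_root / f"{source.name}-{stamp}-{suffix}"
        suffix += 1

    # copytree creates the destination itself and refuses to write into
    # a directory that already exists, so older backups stay untouched.
    shutil.copytree(source, destination)
    return destination
```

Calling it as `backup_to_new_folder(Path("D:/Documents"), Path("E:/backups"))` (both paths are placeholders) leaves every earlier backup folder alone. On a drive that is already failing, some files may be unreadable; `copytree` carries on with the rest and raises `shutil.Error` at the end listing what it could not copy, which is a handy list of files to treat as suspect.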

For the record, ACM Queue has a great article about hard drive failure modes. Check it out if you want to get the gory details on why deaths like this happen. It ain't so uncommon, it seems.