[ECCO] yet another hard drive failure

To all users of the cluster:


REASON: The /home storage unit has had two drive failures.

WHO WILL BE AFFECTED? All users of the cluster.


STATUS: NO further jobs can start as of NOW. A follow-up message will be sent upon completion.


We are terribly sorry for the inconvenience and are looking into alternatives. Technical aside: our current hardware (Dell MD-1000) supports RAID 5 but not RAID 6. A RAID 5 array can survive the loss of only one drive at a time. The two drive failures came in rapid succession, so even with a hot-spare drive (a drive reserved solely for an emergency like this), the filesystem would have failed. Newer hardware supports RAID 6, which tolerates up to two simultaneous drive failures. We are looking into what options exist with the current hardware in ECCO.
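
For those curious why a second failure is fatal under RAID 5, here is a minimal sketch of single-parity recovery. It is only an illustration of the XOR-parity idea, not how the MD-1000 or its controller is actually implemented; the function names and the toy 4-byte blocks are made up for this example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def recover_single_parity(stripe, failed):
    """Try to rebuild a stripe protected by ONE parity block (RAID 5 style).

    stripe: data blocks followed by one parity block (hypothetical layout)
    failed: set of indices of blocks that were lost
    Returns the rebuilt stripe, or None if the data cannot be recovered.
    """
    if len(failed) > 1:
        # One parity block gives one equation: two unknowns cannot be solved.
        return None
    if not failed:
        return list(stripe)
    lost = next(iter(failed))
    # Only the surviving blocks are read; their XOR equals the lost block.
    survivors = [block for i, block in enumerate(stripe) if i != lost]
    rebuilt = list(stripe)
    rebuilt[lost] = xor_blocks(survivors)
    return rebuilt


# Toy stripe: three data blocks plus their XOR parity.
data = [b"home", b"user", b"jobs"]
stripe = data + [xor_blocks(data)]

one_failure = recover_single_parity(stripe, {1})      # rebuilt from survivors
two_failures = recover_single_parity(stripe, {0, 2})  # None: unrecoverable
```

RAID 6 stores a second, independently computed parity block per stripe, which is what lets it solve for two lost blocks instead of one.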

  • Lars Vilhuber

    UPDATE [2014-08-26]: We are working on restoring a functional /home directory. The best recoverable state of the home filesystem will also be made available. Mid-term, we expect to have a replacement storage system in place within 7-10 days. Sorry for the inconvenience.