
[ECCO] yet another hard drive failure


To all users of the ecco.vrdc.cornell.edu cluster:

EMERGENCY DOWNTIME: NOW

REASON: The /home storage unit has had 2 drive failures.

WHO WILL BE AFFECTED? All Users of the ecco.vrdc.cornell.edu cluster

WHAT WILL BE UNAVAILABLE? ecco.vrdc.cornell.edu & compute nodes

STATUS: NO further jobs can start as of NOW. A follow-up message will be sent upon completion.

QUESTIONS: ecco-help@cac.cornell.edu

We are terribly sorry for the inconvenience, and are looking into alternatives. Technical aside: our current hardware (Dell MD-1000) does not support RAID 6, only RAID 5, which tolerates the loss of a single drive. The two drive failures came in rapid succession. That means that even with a hot-spare drive (a drive reserved for nothing but an emergency like this), the filesystem would have failed, since a rebuild onto the spare would not have completed before the second drive failed. Newer hardware supports RAID 6, which allows up to 2 drives to fail simultaneously. We are looking into what options there are with the current hardware in ECCO.
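
For readers who want to see why a hot spare does not help against two failures in rapid succession, here is a minimal Python sketch of the reasoning above. It is a simplified model, not a description of how the MD-1000 controller actually works: the function name `array_survives`, the rebuild duration, and the failure times are all hypothetical values chosen for illustration.

```python
import math

# Number of concurrently failed drives each RAID level can tolerate.
FAULT_TOLERANCE = {"raid5": 1, "raid6": 2}


def array_survives(level, failure_times, rebuild_hours, hot_spares=0):
    """Return True if the array never exceeds its fault tolerance.

    Simplified model (an assumption, not controller behaviour):
    - failure_times are the hours at which individual drives fail;
    - each hot spare absorbs one failure, and redundancy is restored
      rebuild_hours after that failure;
    - a failure with no spare left stays unrepaired.
    """
    tolerance = FAULT_TOLERANCE[level]
    spares_left = hot_spares
    restore_times = []  # when each currently missing drive is rebuilt

    for t in sorted(failure_times):
        # Drop drives whose rebuild finished before this failure.
        restore_times = [r for r in restore_times if r > t]
        if spares_left > 0:
            spares_left -= 1
            restore_times.append(t + rebuild_hours)
        else:
            restore_times.append(math.inf)
        # Too many drives missing at once: the array is lost.
        if len(restore_times) > tolerance:
            return False
    return True


# Hypothetical numbers: two failures one hour apart, 12-hour rebuild.
print(array_survives("raid5", [0, 1], rebuild_hours=12, hot_spares=1))
print(array_survives("raid6", [0, 1], rebuild_hours=12, hot_spares=1))
```

Under these assumed numbers, the RAID 5 array is lost because the second drive fails while the rebuild onto the spare is still running, whereas the RAID 6 array survives the same sequence.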

1 comment to [ECCO] yet another hard drive failure

  • Lars Vilhuber

    UPDATE [2014-08-26]: We are working on restoring a functional /home directory. The best recoverable state of the home filesystem will also be made available. Mid-term, we expect to have a replacement storage system in place within 7-10 days. Sorry for the inconvenience.