https://www.anandtech.com/show/14596/toshiba-western-digital-nand-production-partially-halted-by-power-outage
I didn't call it 6 EB in the title because people wouldn't recognize what that means. Anyway, it's estimated that about two weeks' worth of the global supply of NAND was destroyed, or half of what Toshiba and Western Digital will produce in a quarter. This is likely to result in higher SSD prices than there would be otherwise for a while, though nothing like the really high hard drive prices that happened some years ago due to flooding in Thailand. This affects only one of the four major NAND flash vendors.
Apparently a 13-minute power outage is all that it took to do that. If it shuts down a bunch of equipment that is in the middle of various stages of producing chips, that could destroy all of the chips that were in process at the time.
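For a rough sense of scale, here is a minimal Python sketch of what those estimates imply if taken at face value. The 6 EB, two-week, and half-a-quarter figures come from the post above; the 13-week quarter is an assumption, and the derived numbers are only as good as the estimates they are built on.

```python
# Figures quoted in the post; everything derived from them is rough arithmetic.
LOST_EB = 6.0                      # exabytes of NAND reported destroyed
WEEKS_OF_GLOBAL_SUPPLY = 2.0       # "about two weeks' worth of the global supply"
FRACTION_OF_VENDOR_QUARTER = 0.5   # "half of what Toshiba and Western Digital will produce in a quarter"
WEEKS_PER_QUARTER = 13             # assumption: 52 weeks / 4

# Implied global NAND output
global_eb_per_week = LOST_EB / WEEKS_OF_GLOBAL_SUPPLY
global_eb_per_quarter = global_eb_per_week * WEEKS_PER_QUARTER

# Implied Toshiba + Western Digital output and their share of the total
vendor_eb_per_quarter = LOST_EB / FRACTION_OF_VENDOR_QUARTER
vendor_share = vendor_eb_per_quarter / global_eb_per_quarter

print(f"Implied global output: {global_eb_per_week:.1f} EB/week, "
      f"{global_eb_per_quarter:.1f} EB/quarter")
print(f"Implied Toshiba/WD output: {vendor_eb_per_quarter:.1f} EB/quarter "
      f"({vendor_share:.0%} of global)")
```

In other words, the two estimates are consistent with each other only if Toshiba and Western Digital together account for a bit under a third of global output.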
Comments
Second, I'm not sure just how much power they use at that site, but I'd be shocked if it's not well into the megawatt range. Having battery backups capable of cranking out some megawatts of power for very long times would cost a fortune. At some point, you're better off just accepting a risk of a considerable production loss.
Third, fabs are clean room facilities that must be kept pristine. A speck of dust too small to be seen by the unaided human eye is plenty enough to do serious damage, and possibly make an entire chip unusable. So some power generation options simply aren't viable.
I once read a story (in a comment section, so possibly apocryphal) about a fab decades ago that had a bunch of generators on hand in case of a lengthy power outage. When the big power outage finally hit, they fired up the generators, and it looked like everything was good. Then not long after that, they noticed that their yields were awful. After investigating, they found that fumes from the generators had caused a lot more damage than just letting the power go out would have.
Battery banks are there for things to last long enough to put something like the above into action.
EDIT: On second thought, it's also possible that they didn't have backup for anything longer than evening out a power spike. I remember reading about one industrial location where, instead of building backups, they had built next to a major high-voltage line transferring electricity from several hydro dams. That high-voltage line's last outage had been in the 1970s.
And no, it almost certainly wouldn't be hydro storage. Almost all backup power globally is in the form of diesel generator sets, with various forms of UPS storage (flywheel, lead acid, lithium, etc.) in front to give you enough time to start and transfer load over. Lithium batteries are making inroads, but they have a very long way to go to be competitive versus a diesel set for standby power. UPS power on this scale is typically measured in seconds, not minutes.
A plant this size would certainly be megawatts in size. Maybe tens of megawatts. Probably not hundreds of megawatts though. Hydro tends to be in the range of 50-2,000 MW, and it's only really cost effective if you have the right location and can look at it over decades of service. A chip fab isn't going to be looking at decades for a power source. It is possible they pay for two independent utility feeds (and that isn't uncommon for larger facilities as a form of backup), and one of those utility feeds sources from Hydro, but that isn't the same thing as WD or Toshiba owning and operating the dam themselves.
That being said, even with a backup, no power source is 100% reliable. Typical utility is already 96-98%, add a UPS and you're up to 98-99%, add a diesel and you're at 99.5%. You can keep paying a whole lot of money to keep adding to the decimal point, but you will never hit 100%.
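To put those percentages in perspective, here is a small Python sketch that converts an availability figure into expected hours without power per year. The percentages are the ones quoted above (taking the upper end of each range), not measured data, and the function name is just made up for illustration.

```python
HOURS_PER_YEAR = 24 * 365  # ignoring leap years

def downtime_hours_per_year(availability_percent):
    """Expected hours per year without power at a given availability."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR

# Availability figures as quoted in the post above (upper end of each range).
setups = [
    ("utility only", 98.0),
    ("utility + UPS", 99.0),
    ("utility + UPS + diesel", 99.5),
]

for label, pct in setups:
    hours = downtime_hours_per_year(pct)
    print(f"{label}: {pct}% available, about {hours:.1f} hours down per year")
```

Each extra layer cuts the downtime, but the remaining gap to 100% never closes completely, which is the point being made.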
With that though, given the market conditions right now, I tend to think it was an "accident", rather than a real accident. The players in this game are just too shady and have been pulling this for too long for me to really think it was anything else.
"We all do the best we can based on life experience, point of view, and our ability to believe in ourselves." - Naropa "We don't see things as they are, we see them as we are." SR Covey
https://www.anandtech.com/show/14594/micron-shipments-of-3d-qlc-for-ssds-nearly-double-qoq-as-wafer-starts-cut-again
And they're going to do it without halfway producing and then ruining a bunch of wafers, or screwing up equipment so that you have to take a ton of effort to recalibrate it, or a number of other expenses that Toshiba and Western Digital are going to have to deal with.
An intentional "accident" makes no sense at all unless it's covered by insurance--in which case, it would be criminal fraud. And I'd be skeptical about insurance companies being willing to underwrite this at all, or at least not without a ton of measures to detect and catch fraud.
A large fab uses a really large amount of power. Using Google, I was able to find a figure of 100 MW.
That area has 5 fabs. Assuming 100 MW per fab, they'd need 500 MW of backup power.
Backup power is usually generated with large diesel generators, for example these 7 m x 3 m x 2.5 m generators:
To generate 500 MW they'd need to have 200 generators like this running simultaneously.
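The arithmetic behind those numbers, as a quick Python sketch. The 100 MW per fab, 5 fabs, and 200 generators are the figures from this post, and the 13-minute outage is from the original post; the 2.5 MW per generator is only what those figures imply, not a rating taken from a spec sheet. The battery line is a hypothetical comparison: how much stored energy it would take to ride through the whole outage without generators.

```python
import math

FABS = 5               # fabs in the area, per the post
MW_PER_FAB = 100       # rough per-fab figure found via Google, per the post
GENERATORS = 200       # generator count given in the post
OUTAGE_MINUTES = 13    # length of the outage in the original post

def generators_needed(total_mw, unit_mw):
    """Number of generators of a given size needed to cover a load."""
    return math.ceil(total_mw / unit_mw)

total_mw = FABS * MW_PER_FAB                  # total backup power required
mw_per_generator = total_mw / GENERATORS      # implied size of each generator
energy_mwh = total_mw * OUTAGE_MINUTES / 60   # energy to cover the outage

print(f"Total load: {total_mw} MW")
print(f"Implied generator size: {mw_per_generator} MW each")
print(f"Generators needed: {generators_needed(total_mw, mw_per_generator)}")
print(f"Energy for a {OUTAGE_MINUTES}-minute outage: {energy_mwh:.1f} MWh")
```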
In the comments to the article linked from my original post, one of the AnandTech contributors says that Global Foundries has 3 minutes worth of backup power on hand to keep things running in case of a power outage. An outage longer than that just means that a bunch of parts are fried. As they're a pure-play foundry that produces custom chips for customers, not a commodity like NAND, they don't even benefit from higher prices resulting from a shortage due to a bunch of production being destroyed.
If he lost the rig because there was a power spike in the network then warranty doesn't cover those.
More seriously, even if something is covered by warranty, it's better not to need the warranty. It can also be hard to figure out what broke when a computer doesn't work.
After that, the rig would run normally, but once in a blue moon it would freeze yet again. So I tried switching the GPU (it was either that or the hard drive) and it worked. I'm not sure which component was at fault, but an almost complete replacement was necessary for mine to work again. I even used the whole situation as an excuse to upgrade my keyboard and mouse :P
A power outage is much less destructive because normally there's only a sudden power loss, and the device only has to be built so that it never breaks itself even if it suddenly loses all power.
I just mentioned it just in case his friend has mistaken the power outage with what happened to me.
In the 21st century, lightning cannot get inside your computer, PERIOD. (Unless it goes through the LAN cable, if you are stupid enough not to put a router or a switch before your PC, but fiber optics is a commodity these days.)
You have power company fuses that also track your power usage. You have household fuses that would fry long before the lightning reaches your computer, and even if it somehow arcs over the air and moves on through the fuses (that's borderline impossible), regular and cheap power outlets have built-in fuses in case overvoltage happens.
If that's not enough then your power supply will be toast, but it's borderline impossible for the supply to release that voltage into the system, granted it's of a higher quality. Certified PSU's have tons of protections.
Back when we didn't have fiber optics, I got used to buying a new router every summer, because every once in a while lightning would hit the street pole, the electricity would flow from the pole to the router, and the router would be toast. No damage to the computer whatsoever.
But lightning traveling through the power grid? Give me a break.
To keep it short, he replaced what he could, but I highly doubt you needed a RAM change AT ALL. Faulty RAM is hard to catch, and it leads to blue screens, not freezes. You have to run Memtest86+ on it to scan every single memory bit for reading and writing. I've had faulty memory sticks where it took hours to even detect the fault.
Probably you were unlucky and you had a defective GPU. Either that or your GPU needed some BIOS reflashing (I've heard that this helps). Either way it was RMA from the start.