With last month's cover story, "Crash-Proof Computing," we put a stake in the ground for reliable computing. In the past, we've been as guilty as anyone of being dazzled by clock speeds and feature sets. These are important, but reliability has become a critical issue, and at BYTE we're not closing our eyes to it anymore. In the aforementioned cover story, senior editor Tom Halfhill took a balanced look at the whys and wherefores of today's woefully crash-prone PCs.

At the top of this page, it says "Editorial," so I don't have to be balanced. In my opinion, there's absolutely no excuse for computing that requires multiple reboots each day. The typical excuses sound awfully whiny to me: everything's so complex; people want so many features; it's a competitive market. It's often been pointed out that only in computing do we tolerate the shoddy work that passes for mainstream operating systems and apps. And I'm hard-pressed to think of any other piece of hardware you can buy for $3000 that's as failure-prone as a PC.

Sure, the culprits are often the cut corners that make PCs affordable: the cheap video card, the marginal memory chip, the inexpensive drive controller. But how often do PCs fail because of bad cooling, improper voltage regulation, or other fundamental design flaws? I'm betting pretty often. Preventing those failures with better components would not be that expensive.

For example, while it's nice that disk drive capacities have been on a serious price/performance ramp-up, I'm sure we'd all take a slightly less steep ramp if it were combined with increasingly rugged and foolproof drives. Yet several years ago, some in the drive industry adopted the reverse tack. They said: Who needs MTBFs of hundreds of years when computers are obsolete in four? Yes, but that M stands for mean; that average is a combination of happy clams and less fortunate individuals whose hard drives fail in the first year of operation. (I'll put some numbers on that below.)

So, no more excuses. Time to get serious about quality. I'm mad as hell and I'm not going to take it anymore! I have some definite ideas about where computing has to go to become more reliable. I'll be on-line May 11-15 (http://www.byte.com/discuss/discuss.htm) to talk about them with you. We'll put together a Manifesto for Reliable Computing and lobby for it in the industry. As a first step, we'll add to our awards at PC Expo and Comdex and our annual Editors' Choice Awards to recognize products that contribute to this cause, if and when they appear.

So, let me throw down the gauntlet and get the dialog started:

1. All OS and application installs should have absolute rollback capabilities, for multiple generations. The PC will never live in as small a universe as the mainframe, where packaged software tends to be much better behaved. In such a diverse world, rollback at least preserves workability.

2. General-purpose computers need some real-time capabilities. Specifically, what good is network administration if a computer is so locked up that it can't communicate over the Net? All operating systems should keep a channel open to network admin traffic.

3. All OSes should be self-healing. If any component of the OS becomes corrupted or fails (due to version conflicts, for example), the OS should know that and take any appropriate actions.

4. All computers should monitor their own temperature and power consumption (for high and low voltage) and warn of out-of-limit operations. (A minimal sketch of the idea follows this list.)

5. All peripherals should run a POST-like diagnostic at start-up and have an invocable diagnostic routine that can be run when trouble occurs.
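To make point 4 concrete, here is a minimal sketch of the kind of self-monitoring I have in mind, written in Python purely for illustration. The limits are arbitrary assumptions, not vendor specs, and the sensor reads are stubbed out; a real machine would get them from firmware or a management bus, and a real OS would log and alert rather than print.

    import time

    # Illustrative limits -- assumptions for this sketch, not vendor specs.
    TEMP_LIMIT_C = 70.0       # warn above this case temperature
    VOLT_NOMINAL = 5.0        # nominal supply rail
    VOLT_TOLERANCE = 0.25     # warn beyond +/- 5% of nominal

    def read_temp_c():
        """Stub: a real machine would read a thermal sensor here."""
        return 45.0

    def read_volts():
        """Stub: a real machine would read the supply rail here."""
        return 5.02

    def check_health():
        """Return a list of warnings for out-of-limit operation."""
        warnings = []
        temp, volts = read_temp_c(), read_volts()
        if temp > TEMP_LIMIT_C:
            warnings.append(f"temperature {temp:.1f} C exceeds {TEMP_LIMIT_C:.1f} C")
        if abs(volts - VOLT_NOMINAL) > VOLT_TOLERANCE:
            warnings.append(f"supply {volts:.2f} V outside {VOLT_NOMINAL:.1f} +/- {VOLT_TOLERANCE:.2f} V")
        return warnings

    while True:
        for warning in check_health():
            print("WARNING:", warning)
        time.sleep(60)            # poll once a minute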
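And to put numbers behind my MTBF gripe above, here is a back-of-the-envelope sketch. It assumes the simplest constant-failure-rate (exponential) model; real drives also suffer infant mortality, which makes the first-year picture worse, not better.

    import math

    def fraction_failed(mtbf_years, years):
        """Expected fraction of units failed within `years`, given an MTBF."""
        return 1.0 - math.exp(-years / mtbf_years)

    mtbf = 300.0    # an MTBF of "hundreds of years"
    print(f"{fraction_failed(mtbf, 1.0):.2%} fail in year one")  # about 0.33%
    # Ship a million drives and that's more than 3,000 dead drives in the
    # first year -- the unlucky owners hiding behind a comfortable mean.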
Beyond such a list of engineering principles, we should all exercise stern judgment in purchasing. Replace that smile with a frown when vendors say they produced their app on "Web time": what they mean is that it's late alpha code and you're the beta tester. We all complain about buggy software and unreliable hardware; now let's put our money where our mouth is.

Mark Schlack, Editor in Chief
mschlack@bix.com