Capacitor FAIL and other hardware lessons

I remember well how, in the mid-1990s, a professor of physics demanded that a university save money when purchasing computers. His theory was that one or maybe even two extra PCs would be available in a lab with the money saved.

The problem with his theory was that the less-expensive computers experienced a high rate of malfunction and failure. The computers were purchased specifically to perform lab work using devices connected to a serial port. The serial port depended on a 16550 Universal Asynchronous Receiver/Transmitter (UART) chip.

At that time Gateway 2000 was saving money by using the least expensive parts available. An order of fifteen PCs could end up with fifteen different UART brands and/or versions, many of which would fail under load. More specifically, we suspected that a single character would get left in the shift register and one in the holding register; the character then would not transmit, and the chip would give no interrupt or alert. System failure.

It was not possible to determine the revision of the installed chip through software, so drivers could not compensate for or adapt to this problem. The solution at that time, after meetings and evaluation of PC vendors, was to dump the Gateway investment and purchase Dell “business-line” computers: the OptiPlex. Dell offered the university a guarantee of chip quality control and consistency, which actually turned out to be the case for the UART.

The bottom line was that more money was saved by high availability in just one semester than by the lower initial capital investment.

Apparently the same could not be said for capacitors.

Engadget does not mince words in a recent report regarding the OptiPlex:

Dell asked customer service reps to deny there was any problem with their motherboards, telling them to pretend they’d never heard about the issue and to “emphasize uncertainty.”

Uncertainty is exactly what consumers should be trying to avoid.

An earlier post on Engadget suggests a 97% failure rate!

According to recently released documents stemming from a three year-old lawsuit, Dell not only knew about the bogus components but some of its employees were actively told to play dumb, one memo sent to customer service reps telling them to “avoid all language indicating the boards were bad or had issues.” Meanwhile, sales teams were still selling funky OptiPlex machines, which during that period had a 97 percent failure rate according to Dell’s own study.

To be fair that still leaves a 3% chance of success — uncertainty isn’t gone yet.

Imagine 3% of an office working, or 3% of a student body getting their work done…

This is not just a problem with Dell or Gateway, of course. All manufacturers of technology equipment face the question of quality when building their products.

I noticed the D-Link DWL-3200AP, for example, was using a low-ESR capacitor rated for only 1000 hours. This seems far below the normal use one might expect from a wireless bridge. Anyone could go buy a 7000-hour high-temp capacitor for less than a quarter.
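To put those ratings in perspective, here is a rough Python sketch. The hour ratings are the ones quoted above. The rated and operating temperatures are my own assumptions for illustration, not datasheet values, and the doubling of life for every 10 degrees C below the rated temperature is a common rule of thumb for aluminum electrolytic capacitors, not a guarantee.

```python
def continuous_days(rated_hours: float) -> float:
    """Convert a rated life in hours into days of 24x7 operation."""
    return rated_hours / 24.0


def estimated_life_hours(rated_hours: float,
                         rated_temp_c: float,
                         operating_temp_c: float) -> float:
    """Rule-of-thumb estimate: life doubles per 10 C below the rated temperature."""
    return rated_hours * 2 ** ((rated_temp_c - operating_temp_c) / 10.0)


if __name__ == "__main__":
    # Assumed temperatures (hypothetical, not from any datasheet).
    RATED_TEMP_C = 105.0
    OPERATING_TEMP_C = 65.0

    for hours in (1000, 7000):
        life = estimated_life_hours(hours, RATED_TEMP_C, OPERATING_TEMP_C)
        print(f"{hours} h part: {continuous_days(hours):.0f} days at rated temp, "
              f"roughly {life / 8760:.1f} years at {OPERATING_TEMP_C:.0f} C")
```

Run at its rated temperature around the clock, a 1000-hour part lasts about 42 days and a 7000-hour part about 292 days. A cooler enclosure stretches both, but the gap between the two parts stays the same factor of seven.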

Likewise, I found that the Motorola 2210-02 ADSL2+ broadband modem has a capacitor that fails under load. It overheats and then shuts down the broadband link (perhaps you were wondering why this site went down for a day or two last month; thank you for the hits, and for exposing a hardware failure in my infrastructure). This is only marginally better than complete failure. It is worse in that the failure is intermittent, which masks the cause. It is better in that, once I found the problem, I was able to keep the link up by removing heat.

Oh, and do not get me started on Apple hardware failures. I am on my third (and last) iPhone in only six months. The most recent failure was caused by a bad cable. Who puts six ribbon cables in a phone? This is a device that is totally sealed to consumers and constantly moved around. Ribbon cables are known to come loose. Put the two risk factors together…my phone was unusable for two days (screen had limited functionality) and I spent two hours at an Apple store just to get the cable re-seated.

I would gladly have paid an extra dollar or two to avoid the multi-day outage. Two antenna cables, three data cables, and a screen cable; in other words, six too many.

The lesson seems to be that hardware quality continues to plague network devices with serious security (availability) consequences.

Product companies make decisions that might not reflect your requirements, but they also do not give much transparency prior to the purchase or readily accept fault afterwards. Buyer beware.

Here are a few suggestions for how to reduce hardware risks:

  1. Test – We would have found the UART failure quickly if we had just ordered one or two systems and run them through the paces
  2. Contract – Make certain that a failure of hardware is covered with warranty and perhaps even compensation
  3. Virtualize – Isolate hardware to a single highly-redundant device and then put the other devices into a virtual environment where you have more control and better logging options

WordPress SQL Attacks

This attack has been around a while, but an IP range in Belarus with a user-agent of Mozilla/4.0 appears to be trying it again. WordPress servers should be prepared for the old SQL attack.

Here are just two of the many attempt types:

?cat=999+UNION+SELECT+null,CONCAT(666,CHAR(58),user_pass,CHAR(58),666,CHAR(58)),null,null,null+FROM+wp_users+where+id=1/*

?cat=%2527+UNION+SELECT+CONCAT(666,CHAR(58),user_pass,CHAR(58),666,CHAR(58))+FROM+wp_users+where+id=1/*

This attack tries to expose the blog software’s admin (id=1) password. I guess 666 is a delimiter for someone. If successful, the attack looks like it will generate a page with the admin password hash positioned between a pair of 666 markers and colons (CHAR(58)), like this:

666:PasswordHash:666
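A rough way to spot these probes in a web server access log is sketched below in Python. It is only an illustration: the function names are mine, the pattern covers just the two request types shown above, and a real attacker can vary the payload to slip past a filter this simple. The decoding loop is there because the second probe is double URL-encoded (%2527 becomes %27, which becomes a single quote).

```python
import re
from urllib.parse import unquote_plus

# Matches the probes shown above: a UNION SELECT that reaches for wp_users.
PROBE = re.compile(r"union\s+select.*wp_users", re.IGNORECASE)


def decode_repeatedly(text: str, rounds: int = 3) -> str:
    """Undo URL encoding up to `rounds` times, stopping once nothing changes."""
    for _ in range(rounds):
        decoded = unquote_plus(text)
        if decoded == text:
            break
        text = decoded
    return text


def is_sql_probe(request_line: str) -> bool:
    """Return True if the decoded request looks like the wp_users probe."""
    return bool(PROBE.search(decode_repeatedly(request_line)))


def suspicious_lines(log_lines):
    """Return only the log lines that look like SQL injection probes."""
    return [line for line in log_lines if is_sql_probe(line)]
```

Feeding it the lines of an access log, for example `suspicious_lines(open(path))` with your own log path, returns the requests worth a closer look.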

To check whether you have been breached, use a shell account and log in to mysql:

mysql -pPassword -u Username Databasename

Then look for id=1:

select * from wp_users where id=1;

This should show you the admin account information including the hashed password.

One form of prevention against these lame scripted attacks is to set up a WordPress blog with the wp_users table named something else, such as dudes or even just users. The problem with this is keeping WordPress and its plugins aware of the new table name. Defaults have the obvious risk, but WordPress does not even allow the admin account name to be changed without directly editing the database.

Another way to reduce the effectiveness of scripted attacks is to use an application-level firewall like SEO Egghead’s plugin or PHPIDS.
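The simplest check such a filter can apply is an allowlist: accept only the input shape you expect and reject everything else. The Python sketch below assumes the cat parameter is meant to carry nothing but a numeric category id. The function name and the length limit are hypothetical, and this is not how SEO Egghead's plugin or PHPIDS works internally; it only shows the idea.

```python
from typing import Optional


def parse_category_id(raw: str) -> Optional[int]:
    """Return the id if `raw` is a short run of ASCII digits, otherwise None.

    Assumption: a legitimate `cat` value is a plain non-negative integer.
    """
    if raw.isascii() and raw.isdigit() and len(raw) <= 10:
        return int(raw)
    return None
```

With this in front of the query, `parse_category_id("999")` yields a usable integer, while anything carrying a UNION SELECT or an encoded quote comes back as None and never reaches the database.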

War No Longer Exists

I continue to see interesting points raised by information technology security professionals getting dragged into traditional themes of power and politics, especially as they relate to war and cyberwar.

The BSides Denver conference, for example, led to a heated exchange between a military lawyer and his audience when he tried to differentiate between Cyber Attack and War. The Economist stoked things to a much wider audience with their latest issue. The Economist, for what it is worth as a conservative voice, has less concern than the Denver audience and essentially agrees with David Willson’s presentation.

It just occurred to me, however, to search my own blog for things I have written on war and cyberwar. Perhaps this is a good time to confess that I studied International History at the London School of Economics before I started working full time on information security. My research focused on post-WWII international relations, which to most people seems to mean war.

Thus it has been hard for me to avoid peppering this blog with the occasional thought on politics and wars. That is my excuse anyway.

Here is a fine example I posted in 2005 regarding a book by General Sir Rupert Smith called “The Utility of Force: The Art of War in the Modern World”:

Battles just don’t work any more. War is now waged not in the field but the street, so victory is possible only with the people’s consent

His book should have been titled The Art of Waging an Act Formerly Known as War. But seriously, the term War has its own definition that is separate and distinct from its modifiers. Civil War means something different from just War, in other words. Likewise, Cyber War should be held to mean something different from War. In that sense, I can see how the case could be made that War alone may no longer exist.

Cloud Security for Home and SMB

I see increasing evidence that the cloud is drifting into the home and small to midsize business (SMB) market. This is a great thing for security, but also should raise concern.

Take for example inexpensive network attached storage (NAS) devices. Only a few hundred dollars will get a self-contained box with RAID and network services. Several terabytes in a redundant array on the network is a great thing for a home or SMB that wants to safely back up data. The next step in data availability is to start to rotate backups to an off-site location.

Enter the cloud.

Service providers like Dropbox or CTERA offer to replicate the data from a NAS. Here is some typical marketing information I found on the CTERA site:

Before data is sent from the Cloud Attached Storage appliance to its online backup destination, it is encrypted using 256-bit AES (Advanced Encryption Standard). This is a highly secure encryption algorithm, approved as safe enough for protecting U.S. government classified material, and widely used by banks.

Highly secure? Very convincing. Oh, wait, do they mean widely used by the government agencies and banks that still get breached? I do not find this kind of vague industry reference very reassuring, but maybe I know too much. They also offer SSL for confidentiality in transmission and SHA-1 for data integrity. Nice to see standards.
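For what the integrity piece amounts to, here is a minimal Python sketch of a SHA-1 check using the standard hashlib module. It is my own illustration, not CTERA's implementation, and the function names are hypothetical. The idea is to record a digest before the data leaves the appliance and compare it against a freshly computed digest after the data comes back.

```python
import hashlib
import hmac


def sha1_digest(data: bytes) -> str:
    """Return the SHA-1 digest of `data` as a hex string."""
    return hashlib.sha1(data).hexdigest()


def is_intact(data: bytes, expected_digest: str) -> bool:
    """Compare a fresh digest with the recorded one in constant time."""
    return hmac.compare_digest(sha1_digest(data), expected_digest)


if __name__ == "__main__":
    original = b"quarterly-report contents"
    recorded = sha1_digest(original)            # stored before upload
    print(is_intact(original, recorded))        # unchanged data
    print(is_intact(original + b"x", recorded)) # altered data
```

A check like this only detects tampering if the recorded digest is kept somewhere an attacker cannot also change it.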

Moving on, I noted their explanation of key management. After all, this is what really matters in the world of encryption when it comes to getting a secure service.

Passwords are required to access online backup versions of your data. You may choose between two options of passphrase protection:
* An automatically-generated key: This offers the ability to reset the key if it is forgotten.
* A personal passphrase: In this case, you choose a passphrase known only to you. While this offers an additional level of privacy, it also means that if the passphrase is forgotten, the protected data will not be retrievable at all.

The first option is not explained clearly. Many consumers probably will not realize that the ease of resetting a key is inversely related to the safety of their data in the cloud. How is the reset handled? I see the “additional level of privacy” in option two as really the baseline, not something extra. I would warn customers that using a reset option is below a baseline of privacy, like leaving their front door key under the mat.
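A short Python sketch shows why a forgotten personal passphrase means unrecoverable data. It assumes the encryption key is derived from the passphrase with a standard key-derivation function (PBKDF2 here); CTERA does not say how it derives keys, so the function name, salt size, and iteration count are my own choices for illustration.

```python
import hashlib
import os


def derive_key(passphrase: str, salt: bytes, iterations: int = 200_000) -> bytes:
    """Derive a 256-bit key from a passphrase with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=32
    )


if __name__ == "__main__":
    salt = os.urandom(16)  # not secret; can be stored next to the backup
    key = derive_key("a passphrase only I know", salt)
    print(len(key) * 8, "bit key")

    # The same passphrase and salt always rebuild the same key...
    print(key == derive_key("a passphrase only I know", salt))
    # ...while any other passphrase produces a key that cannot decrypt the data.
    print(key == derive_key("a passphrase I guessed", salt))
```

If only the salt is stored, nobody can rebuild the key without the passphrase. A key that can be reset, by contrast, must be stored or reproducible somewhere outside the customer's head, and that somewhere is what an attacker will go after.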

A big question for the cloud provider is whether there is more risk in someone attacking the reset mechanism and compromising encrypted storage or in customers losing their keys. Helpdesk and support costs might typically be considered higher for more secure options. However, it seems to me that since they offer a backup service and not primary data access, they should still encourage customers to lean away from any convenient reset options. Alternatively, they could add support for change/access logging and alerting for data in the cloud.