Dangers in Predicting the Future With Data

Mike Greenfield has some really insightful things to say on his blog about statistical risk in big data and the difficulty of predicting human behavior. Take, for example, his experience starting a company, which showed how dangerous it is to rely on a sole supplier.

So Facebook acted rationally, optimizing for their own best interests and those of their users. They killed the notifications feature (which we used to tell someone her friend’s child was turning two). They removed boxes and tabs from profile pages (which over a million moms had added to show off their kids’ accomplishments). And they hid invitations (which moms used to tell their friends about our product).

At that time, we were almost completely dependent on Facebook’s channels to communicate with our users and find new ones. We felt like a beer maker preparing for the government banning beer sales in markets, shutting down bars, and only allowing people to drink in restaurants on Tuesdays. Not quite prohibition, but pretty darned close.

I want to reiterate that Greenfield is in the business of predicting human behavior based on data analysis. Although he says “Facebook acted rationally,” he actually started his blog post with “Facebook, the VCs said, could suddenly turn off all of their communication channels and we’d collapse. We thought they were full of it…”

Why didn’t he see it coming?

It sounds to me like the VCs predicted the danger of losing a sole supplier. That makes sense in a simple predictive risk model. A “rational” behavior model for suppliers who see economic opportunity, however, is a complex and messy business. A supplier killing their distribution channel really shouldn’t be described so casually, as if it were easy to predict or simply rational/optimizing.

I love the prohibition analogy, although probably not for the reasons Greenfield uses it. Prohibition is a good example of bad regulation and the security risks that result.

Consider for a moment how the consumption of alcohol actually increased in America after it was banned. If Facebook’s regulation of data were like prohibition then we should predict an illegal data running/smuggling boom.

That didn’t happen, as documented by Greenfield. Instead his story centers on “cutting the cord” and walking away from Facebook forever.

Also consider that prohibition in America was led by popular religious extremists (well, popular in Kansas anyway) who violently forced into power a bunch of blatant hypocrites.

A “conservative” politician who said he favored a “dry” country ended up meaning someone who drank but refused to admit it. In today’s terms it is similar in nature to the radically homophobic politicians.

Those calling for regulation thus can be mired in complex psychological and cultural issues, which makes “rational” predictions of their economic behavior less than obvious. Was Greenfield accounting for a fundamentalist Carrie Nation element to Facebook when he was threatened by “hatchetation” of his data?

The really interesting point of Greenfield’s story is that at the same time he (like most people) predicted a demise of email and replacement with social networking (risk of staying on email), he also was using the venerable traditional direct-communication path of email to save his company from destruction.

As 2010 came to a close, the proverbial feces was hitting the proverbial fan, and we started to look at email as a way out of the ditch. […] Over the course of 2011, we streamlined our content-writing and emailing operations, in the process turning email into a viable re-engagement channel for millions of moms.

The lesson of course is to predict and manage risk related to distribution channels to your customers, which is what the VCs told them in the first place. It sounds to me as though, had he followed his own risk analysis based on a prediction of the future, he would have been far worse off. In other words, don’t stop using email unless you understand the true risk of giving up ownership and control over your communication.

Fast forward to Greenfield’s more recent post called “Predicting the Future is the Future” and he extols automation.

Automation is incredibly important. It democratizes the process of building and using statistical models, so that a small startup (with lots of data) can build pretty good statistical models without a team of statisticians. These automated statistical models will almost inevitably perform more poorly than their human-built counterparts, but they’re close enough to be competitive.

I really want to agree with him, because technology can make data more accessible and therefore more democratic. Giving out statistical model tools to everyone means they too can start a company and make money from mining your personal data.

But again he leaves out an essential part of behavior — who gets to own and control access to data. This part of risk has to be better defined before we can celebrate democracy and a risk reduction.

His description of the troubles with Facebook gives a clear example of how automation can be rendered completely useless: it runs straight into severe power inequality in terms of resource control and management risks.

Alas, back to the Facebook prohibition analogy, every farm in America used to have an apple tree, if not an orchard. Yet the saying “as American as apple pie” is a subtle reminder of the strange story of hard cider in America.

150 years ago, in the 1840s, hard cider held the position now held by beer as the preferred alcoholic beverage of the working class.

Where did it go? It turns out that while technology democratized the process of building farms and making goods it alone was unable to prevent the extinction of the preferred beverage in America.

…the temperance movement remains as a major culprit responsible for the decline of cider consumption in the U.S., but the association of cider with rural WASP culture was the added factor which distinguishes cider from beer or wine. Add to this the economics of beer production, growing urbanization, German immigration, a predatory beer industry, and a substitute drink in coca-cola, and there seems to be enough factors working together to explain why and how cider so completely disappeared.

A statistician looking at data in 1840 might have said cider was the future, but the question is whether they could or would have predicted a much more complicated mix of risk factors related to irrational human behavior (e.g. religious fervor and ethnic prejudice) that killed the market.


England’s farmers were insulated from the risk of politics and industry in early 1900s America, so they still make cider:

[Image: cider at Broome Farm. Source: Broome Farm on Flickr.]

Mother Earth News says it is not too late to learn how to make your own American cider…assuming you can find a reliable apple distributor.

Performics Market Data and Analysis

Performics has a slideshare deck called “Life on Demand 2012 Summary” with some data points and analysis that caught my attention. I was hoping for some insight into phishing and social engineering attack vectors.

It suggests, for example, that people who take 25 minutes to complete an online survey prefer being online to “traditional modes”…

Maybe it’s just me but if there was a “five point scale” to agree or disagree then a single 52% value is obscuring the data distribution. Wonder why they didn’t ask agree/disagree in the first place.
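To make the point about the hidden distribution concrete, here is a minimal sketch. The response counts are invented for illustration and are not taken from the Performics deck, and it assumes the 52% figure is a "top two box" score (agree plus strongly agree). Two very different sets of answers collapse to the same headline number.

```python
# Hypothetical five-point-scale responses. These counts are invented for
# illustration and are NOT from the Performics deck.
# Index 0..4 = strongly disagree, disagree, neutral, agree, strongly agree.
polarized = [40, 8, 0, 2, 50]   # respondents split between the two extremes
lukewarm = [0, 4, 44, 50, 2]    # most respondents neutral or mildly agreeing


def top_two_box(counts):
    """Percent of respondents who chose 'agree' or 'strongly agree'."""
    return 100 * (counts[3] + counts[4]) / sum(counts)


def mean_score(counts):
    """Average response on a 1-5 scale."""
    return sum((i + 1) * c for i, c in enumerate(counts)) / sum(counts)


for name, counts in (("polarized", polarized), ("lukewarm", lukewarm)):
    print(name, top_two_box(counts), round(mean_score(counts), 2))
```

Both samples report 52% agreement, yet one audience is sharply divided and the other is mostly indifferent. That is the information a single percentage throws away.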

I am also curious whether the number would change if the survey were run on a social media site. And while many of the slides claim that they can reveal trends, there is no timeline/timeframe and no prediction of a future value. The last page in the survey also reveals that about 60% of survey respondents are not employed.

US Sailing Report on Farallones Tragedy

A US Sailing Farallones Panel Report has been posted with detailed analysis of the Low Speed Chase Capsize on 14 April 2012.

Four safety issues are explained. The first is that the crew sailed too close to shore.

As a result of the panel’s investigation, it became clear that the cause of the capsize was that Low Speed Chase sailed a course which took them across a shoal area over which breaking waves could be expected to occur several times per hour (see Appendix D) and encountered a breaking wave, which capsized the boat.

[…]

With a forecast for swells up to 15 feet, a maximum wave height of 30 feet would be expected, and 1% of waves (two or three per hour) would be expected to average 25 feet in height. The forecast wind waves would add two or three feet to the maximum wave. (See Appendix D)
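The panel’s figures are consistent with common rule-of-thumb ratios for Rayleigh-distributed wave heights. The sketch below assumes, without confirmation from the report excerpt, that the 15-foot forecast is treated as significant wave height and that the panel used ratios of roughly 2.0 (expected maximum) and 1.67 (average of the highest 1%). The 14-second wave period is a hypothetical value chosen only to show where "two or three per hour" could come from.

```python
# Rule-of-thumb ratios relative to significant wave height (Hs) for
# Rayleigh-distributed wave heights. Assumption: the panel's numbers
# follow these ratios; the report excerpt does not state them.
H_MAX_RATIO = 2.0             # commonly quoted expected maximum wave
H_TOP_1_PERCENT_RATIO = 1.67  # mean height of the highest 1% of waves


def wave_estimates(hs_feet, wave_period_s):
    """Estimate extreme wave heights and how often the top 1% arrive."""
    waves_per_hour = 3600 / wave_period_s
    return {
        "max_height_ft": H_MAX_RATIO * hs_feet,
        "top_1_percent_avg_ft": H_TOP_1_PERCENT_RATIO * hs_feet,
        "top_1_percent_per_hour": 0.01 * waves_per_hour,
    }


# 15 ft comes from the quoted forecast; the 14 s period is hypothetical.
est = wave_estimates(hs_feet=15, wave_period_s=14)
for key, value in est.items():
    print(key, round(value, 2))
```

With these inputs the maximum comes out at 30 feet, the top 1% at about 25 feet, and the top 1% arrive between two and three times an hour, matching the quoted passage.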

The remaining three issues are related to adequate safety gear, communication and incident response procedures. Other sailors are also called out in the report for the decision to not assist.

Of the seven other race boat crews interviewed who witnessed the incident, all deemed the conditions too dangerous to physically stand by and attempt to render assistance. All continued racing.

[…]

22 boats heard the radio traffic concerning the LSC incident and five respondents saw Low Speed Chase on the shore, while one actually saw the capsize.

AT&T Announces End of 2G

AT&T just filed a 10-Q with the SEC and publicly confirmed what the company has been warning about in private for the past two years:

Also as part of our ongoing efforts to improve our network performance and help address the need for additional spectrum capacity, we intend to redeploy spectrum currently used for basic 2G services to support more advanced mobile Internet services on our 3G and 4G networks.

[…]

We expect to fully discontinue service on our 2G networks by approximately January 1, 2017.

[…]

As of June 30, 2012, approximately 12 percent of our postpaid customers were using 2G handsets.

A five-year sunset plan seems like a long time to those of us who would argue 2G is a terribly weak and dated protocol.

Any further delay is especially bad news for Apple customers who are unable to choose 3G-only (i.e. iPhone and iPad). (Another reason I recommend the Nokia N9 is the option to disable 2G communication).

2G, or 2nd Generation, was launched in Finland in 1991. How many electronic devices are you using today that are 21 years old? More to the point, 2G is older than the web and pre-dates the “data” revolution in communication. It also used a security-through-obscurity method, which became untenable by the mid-1990s. Although 2G had some functionality limitations fixed through extensions (2.5G), it never really fixed the security problems. Instead, work on a 3G network started in 1992, and by 2001 it had launched in Japan. The path to far better performance and security should be crystal clear.

Yet AT&T doesn’t mention security in their filing as one of the reasons for ending their old network. Perhaps they don’t want to draw attention to the fact that it is trivial to impersonate a GSM base transceiver station (BTS). Or maybe they don’t want to mention that the fixed network is unprotected, encryption is weak (COMP128 implementation of the A3 and A8 algorithms can be broken in less than a minute), encryption is often disabled and/or completely useless (keys sent in the clear), there is no integrity or network identity…and so forth.

The AT&T filing says they have just over 100 million customers. So the share still on 2G, which they put at 12%, must be around 12 million customers. That sounds like a lot of vulnerable end-users until you take a closer look at usage profiles. It is tempting to think of the numbers in terms of consumer handhelds. In fact this announcement has more relevance to appliance-like devices such as ATMs, Point-of-Sale and security alarms.
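The back-of-the-envelope estimate can be checked in a few lines. Note that the quoted filing states the 12 percent figure for postpaid customers, so applying it to the full customer base gives an upper-end estimate. The postpaid base used below is a hypothetical placeholder, not a number from the filing excerpt.

```python
# Rough estimate of customers affected by the 2G shutdown.
TOTAL_CUSTOMERS = 100_000_000        # "just over 100 million", rounded down
SHARE_ON_2G_PERCENT = 12             # quoted in the filing for postpaid customers
HYPOTHETICAL_POSTPAID = 70_000_000   # placeholder; not from the filing excerpt


def affected(customers, percent):
    """Number of customers represented by a whole-number percentage."""
    return customers * percent // 100


upper_estimate = affected(TOTAL_CUSTOMERS, SHARE_ON_2G_PERCENT)
postpaid_estimate = affected(HYPOTHETICAL_POSTPAID, SHARE_ON_2G_PERCENT)
print(upper_estimate, postpaid_estimate)
```

Either way the result is in the millions, which is why the usage profile behind the number matters more than the exact count.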

So the problem of 2G is not really about people who refuse to buy a new phone. There might be a few of those but most humans tend to frequently update their phones for a number of simple functionality reasons from dead batteries to better signal while moving around. Users also tend to absorb some of the replacement procedure costs.

The embedded device market, however, has a harder time discontinuing deployed assets and dealing with the cost of re-provisioning. Embedded devices tend to come with an if-it-ain’t-broke-don’t-fix-it mentality for upgrades. Embedded devices also may drop down to 2G to provide service continuity. A message getting through often gets higher priority than a message being kept a secret; instead of demanding better service/coverage from AT&T, 2G may be given as an availability option. Unfortunately, embedded devices tend to be used for applications that are security-related and need confidentiality to be a priority.

In other words, AT&T could probably greatly accelerate the adoption of 3G and newer networks for millions of remaining devices if they admitted or otherwise raised awareness of serious security issues in 2G. In the meantime I suspect some may continue selling 2G as a deceptively “inexpensive” and “reliable” option right up to the end of service in 2017.