Friday, December 28, 2007

Internet stability technology exploited by hackers

In a deliciously devious example of unintended consequences, hackers are exploiting a technology meant to improve internet stability to hide their malware sites. Andy Greenberg reports for Forbes in Future Phishing, 12/28/2007:
"Fast flux takes advantage of an option built into the Web's architecture, which allows site owners to switch the Internet Protocol address of a site without changing its domain name. The IP-switching function is meant to create more stability on the Web by allowing an overloaded Web site to switch servers without a hiccup. But cybercriminals using fast flux take advantage of the option to move the physical location of their malicious sites every few minutes, making them much harder to block or shut down."
Another lesson, if any were needed, that all technologies are double-edged.
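
To make the mechanism concrete, here is a minimal sketch of how one might watch a suspect domain for fast-flux behavior. It is my own illustration, not anything from the Forbes piece, and the domain name in it is a made-up placeholder; the idea is simply to poll DNS repeatedly and see how quickly the returned addresses churn.

    import socket
    import time

    def poll_a_records(domain, rounds=3, interval=60):
        """Poll DNS every `interval` seconds and report the A records returned."""
        seen = set()
        for _ in range(rounds):
            try:
                _, _, addresses = socket.gethostbyname_ex(domain)
            except socket.gaierror:
                addresses = []
            seen.update(addresses)
            print(time.strftime("%H:%M:%S"), sorted(addresses))
            time.sleep(interval)
        return seen

    # "fastflux-suspect.example" is a made-up placeholder, not a real domain.
    ips = poll_a_records("fastflux-suspect.example", rounds=3, interval=60)
    # A domain that cycles through many short-lived IPs within minutes is a fast-flux warning sign.
    print("distinct IPs observed:", len(ips))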

Wednesday, November 07, 2007

European vs American regulation

An opinion piece in The Economist provides an elegant contrast between policy approaches on either side of the Atlantic. It notes that there’s competition to set global regulatory standards, and that Europe seems to be winning. For example, American multinationals that invest in meeting European standards may decide that lighter domestic regulation gives a competitive advantage to their non-exporting rivals, and so may push for stricter US policies. For global companies, it’s simplest to be bound by the toughest regulations in your supply chain.

From Brussels rules OK: How the European Union is becoming the world's chief regulator, The Economist, Sep 20th 2007:

The American model turns on cost-benefit analysis, with regulators weighing the effects of new rules on jobs and growth, as well as testing the significance of any risks. Companies enjoy a presumption of innocence for their products: should this prove mistaken, punishment is provided by the market (and a barrage of lawsuits). The European model rests more on the “precautionary principle”, which underpins most environmental and health directives. This calls for pre-emptive action if scientists spot a credible hazard, even before the level of risk can be measured. Such a principle sparks many transatlantic disputes: over genetically modified organisms or climate change, for example.

In Europe corporate innocence is not assumed. Indeed, a vast slab of EU laws evaluating the safety of tens of thousands of chemicals, known as REACH, reverses the burden of proof, asking industry to demonstrate that substances are harmless. Some Eurocrats suggest that the philosophical gap reflects the American constitutional tradition that everything is allowed unless it is forbidden, against the Napoleonic tradition codifying what the state allows and banning everything else.

The oil industry collects 51 cents in federal subsidies for every gallon of ethanol it mixes with gas and sells as E10

BusinessWeek reports that oil companies are trying to stop the spread of E85, a fuel that is 85% ethanol and 15% gas (Big Oil's Big Stall On Ethanol, 1 Oct 2007).

Some academics claim that ethanol takes more energy to produce than it supplies. This is contested; but even a UC Davis study says the energy used to produce ethanol is about even with what it generates, and that cleaner emissions would be offset by the loss of pasture and rainforest to corn-growing.

There’s also a nasty little problem with E85: drivers apparently lose about 25% in fuel economy.

Tuesday, November 06, 2007

Gardening the Web

I believe that it’s productive to represent the internet/web as a complex human system, but that’s an abstract concept that’s hard to grasp. A metaphor that everyone’s familiar with can enliven this idea: The internet/web as a global collection of gardens, and making policy is like gardening.

Just like a garden, the internet/web has a life of its own, but can be shaped by human decisions. A garden is neither pure nature, nor pure culture; it’s nature put to the service of culture. The “nature” of the internet/web discourse is its technology and commerce, separate from the “culture” of politics and policy. Few would claim that the internet/web should be left entirely to laissez-faire markets; it is also a social good, and some intervention is needed to protect the public interest.

Before delving further into the analogy between gardening and making communications policy, here is a summary of the properties of complex systems which apply to both:

  1. Hierarchy: systems consist of nested subsystems with linked dynamics at different scales
  2. Holism: the whole has properties different from collection of separable parts
  3. Self-Organization: systems organize themselves, and their characteristic structural and behavioral patterns are mainly a result of interaction between the subsystems
  4. Surprise and Novelty: one cannot predict outcomes of interventions with any accuracy; any given model under-represents the system
  5. Robust Yet Fragile: a system is stable in a large variety of situations, but can break down unexpectedly

Just like the internet/web, there are many kinds of gardens. They vary in scale from window-sill planters to national forests, and in governance from personal to public and commercial. Some objectives of gardening are utilitarian, and others aesthetic; some see gardens as primarily productive and others cultivate them for pleasure. Likewise, some see the internet/web as a tool, and others as a source of meaning.

While most of the work in a garden is done automatically by the plants and other providers of ecosystem services, humans impose their desires regarding outcomes; similarly, internet/web innovation is driven by entrepreneurs and technologists according to their own agendas, though governments try to impose their will on the outcomes.

Just like the internet/web, managing a garden is a humbling business; one can never know precisely how things will turn out. Plants that thrive in the garden next door inexplicably languish in yours. Plagues of pests and disease appear without warning. Unintended consequences abound. For example, people using imidacloprid to control grubs in their lawns may be causing the collapse of bee hives across North America (more).

Just like the internet/web, one can’t stop things coming over the fence from the neighbor’s garden. Birds, squirrels, slugs, and seeds don’t respect boundaries. A garden is embedded in a larger regional system, and its borders are porous. While every gardener can and should shape the garden to their preferences, there is a limit to their independence. The openness brings both plant-friendly bees and bird-chasing cats. Tension with neighbors is inevitable, but can be managed. There is management at many scales, from a gardener’s decision about what variety of tomato to plant for next year, to state-wide prohibitions on planting noxious weeds.

The old silos of traditional communications regulation are like formal gardens or regimented farming. Everything is neat and in its place. There is relatively little variety in the composition and output of the cultivation, and the managers are few and well-defined. Today’s internet/web is more like a patchwork of allotments and wilderness. Control is decentralized, and there is much more variety.

This description of the internet/web as a garden is of course incomplete – like any complex system, different perspectives on the internet will each reveal truths that are neither entirely independent nor entirely compatible. The garden metaphor, built on the analogy of the internet/web as a complex system, captures a lot of the key dynamics. It fits with other place-based metaphors for the web (as a building, market, library, or public venue). There is a resonance with tool metaphors, since gardens are a means to an end, whether pleasure or production. The link to the “internet as communications infrastructure” metaphor is less direct, but they don’t contradict each other.

Monday, October 08, 2007

Mind-boggling

Not the usual fare for this blog, but I found this tidbit amazing: a female Bar-tailed Godwit touched down in New Zealand following an epic, 18,000-mile (29,000 km) series of flights tracked by satellite, including the longest non-stop flight recorded for a land bird: more than eight days and 7,200 miles, the equivalent of making a roundtrip flight between New York and San Francisco and then flying back to San Francisco again, without ever touching down.

For more, see USGS story, USGS satellite tracks, and APSN summary.

Thursday, September 27, 2007

Lobbying Pays

The September 17th BusinessWeek has a fascinating look, Inside The Hidden World Of Earmarks. They tell the story of how the Navy got itself an executive jet that the Pentagon didn’t ask Congress for. Gulfstream lobbied heavily, and the Navy got special funding, known as an earmark. The Georgia Congressional delegation, and Senator Saxby Chambliss in particular, were very helpful - no surprise, since the Gulfstream is built in that state.

BusinessWeek concludes that on average, companies generated roughly $28 in earmark revenue for every dollar they spent lobbying. The top twenty in this game took in $100 or more for every dollar spent. For context, the magazine provides this factoid: companies in the Standard & Poor’s 500-stock index brought in just $17.52 in revenues for every dollar of capital expenditure in 2006.

In Gulfstream’s case, that exec jet deal was worth $53 million. It was just one of 29 earmarks valued at $169 million given to General Dynamics (its parent) or its subsidiaries that year; a nifty 30:1 ROI given that the company spent only $5.7 million on lobbying in 2004.

Wednesday, September 26, 2007

How statutes learn

It’s a truism that rewriting telecoms law is so hard that the US only managed to do it twice in the last one hundred years. But somehow the Congress and the regulatory agencies stay busy, and stuff changes around the edges.

I was suddenly reminded of Stewart Brand’s wonderful book “How Buildings Learn”. (If you have any interest in either architecture or history, I strongly recommend it.) He espouses an onion-layer model of buildings. Quoting from the book:

Site - This is the geographical setting, the urban location, and the legally defined lot, whose boundaries and context outlast generations of ephemeral buildings. "Site is eternal."

Structure - The foundation and load-bearing elements are perilous and expensive to change, so people don't. These are the building. Structural life ranges from 30 to 300 years (though few buildings make it past 60, for other reasons).

Skin - Exterior surfaces now change every 20 years or so, to keep up with fashion or technology, or for wholesale repair. Recent focus on energy costs has led to re-engineered Skins that are air-tight and better-insulated.

Services - These are the working guts of a building: communications wiring, electrical wiring, plumbing, sprinkler system, HVAC (heating, ventilating, and air conditioning), and moving parts like elevators and escalators. They wear out or obsolesce every 7 to 15 years. Many buildings are demolished early if their outdated systems are too deeply embedded to replace easily.

Space Plan - The Interior layout: where walls, ceilings, floors, and doors go. Turbulent commercial space can change every 3 years or so; exceptionally quiet homes might wait 30 years.

Stuff - Chairs, desks, phones, pictures; kitchen appliances, lamps, hairbrushes; all the things that twitch around daily to monthly. Furniture is called mobilia in Italian for good reason.

Brand argues that because the different layers have different rates of change, a building is always tearing itself apart. If you want to build an adaptive structure, you have to allow slippage between the differently-paced systems. If you don’t, the slow systems block the flow of the quick ones, and the quick ones tear up the slow ones with their constant change. For example, timber-frame buildings are good because they separate Structure, Skin and Services; “slab-on-grade” (pouring concrete on the ground for a quick foundation) is bad because pipes are buried and inaccessible, and there’s no basement space for storage, expansion, and maintenance functions.

He quotes the architectural theorist Christopher Alexander: “What does it take to build something so that it’s really easy to make comfortable little modifications in a way that once you’ve made them, they feel integral with the nature and structure of what’s already there? You want to be able to mess around with it and progressively change it to bring it into an adapted state with yourself, your family, the climate, whatever. This kind of adaptation is a continuous process of gradually taking care.”

There seems to be an analogy to policy making. Some things are almost eternal, just like Site: the regulatory imperatives like taxation, public safety, and economic growth. Legislative Acts are like the slowly-changing Structure and Skin. The trade-offs and compromises they represent are hard to build, and so they’re slow to change. Then we get to regulatory rulings made within the context of legislation, the working guts of applying laws to changing circumstances and fine-tuning the details – these are like Services and Space Plan, which change every 3 – 15 years. Finally, like the Stuff in homes that moves around all the time, we have the day-to-day decisions made by bureaucrats applying the regulations.

This kind of model also gives a way to ask, restating Christopher Alexander slightly, “What does it take to craft legislation so that it’s really easy to make comfortable little modifications in a way that once you’ve made them, they feel integral with the nature and structure of what’s already there?”

I imagine that DC operatives do this instinctively – but perhaps an architectural metaphor could make the process even more efficient.

Thursday, September 20, 2007

Why we need stories

One might be able to explain Nassim Taleb’s “narrative fallacy” (see The Black Swan) partly by invoking Patrick Leman’s “major event, major cause” reasoning. Leman describes it thus in The lure of the conspiracy theory (New Scientist, 14 Jul 07):

“Essentially, people often assume that an event with substantial, significant or wide-ranging consequences is likely to have been caused by something substantial, significant or wide-ranging.

“I gave volunteers variations of a newspaper story describing an assassination attempt on a fictitious president. Those who were given the version where the president died were significantly more likely to attribute the event to a conspiracy than those who read the one where the president survived, even though all other aspects of the story were equivalent.

“To appreciate why this form of reasoning is seductive, consider the alternative: major events having minor or mundane causes - for example, the assassination of a president by a single, possibly mentally unstable, gunman, or the death of a princess because of a drunk driver. This presents us with a rather chaotic and unpredictable relationship between cause and effect. Instability makes most of us uncomfortable; we prefer to imagine we live in a predictable, safe world, so in a strange way, some conspiracy theories offer us accounts of events that allow us to retain a sense of safety and predictability.”


Taleb’s account of our inclination to narrate is psychological: “It has to do with the effect of order on information storage and retrieval in any system, and it’s worth explaining here because of what I consider the central problems of probability and information theory. The first problem is that information is costly to obtain. . . The second problem is that information is also costly to store . . . Finally, information is costly to manipulate and retrieve.” (The Black Swan, his italics, p. 68)

He goes on to argue that narrative is a way to compress information. I suspect that the compression is related to extracting meaning, not the raw information. The long-term storage capacity of the brain seems essentially unbounded, but our ability to manipulate variables in short-term memory is very limited, to about four concurrent items. Stories provide a useful chunking mechanism: they’re pre-remembered frames for relationships. There is a relatively limited number of story shapes and archetypical character roles (cf. The Seven Basic Plots) in which cause and effect is carefully ordered and given meaning.

Taleb comes even closer to Leman when he connects the narrative fallacy with the desire to reduce randomness: “We, the members of the human variety of primates, have a hunger for rules because we need to reduce the dimension of matters so they can get into our heads. Or, rather, sadly, so we can squeeze them into our heads. The more random information is, the greater the dimensionality, and thus the more difficult to summarize. The more you summarize, the more order you put in, the less randomness. Hence the same condition that makes us simplify pushes us to think that the world is less random than it actually is.” (The Black Swan, his italics, p. 69) As Leman says, “we prefer to imagine we live in a predictable, safe world.”

Friday, September 14, 2007

Algorithms Everywhere

The Economist this week writes about the increasing use of algorithms, for everything from book recommendations to running supply chains (Business by numbers, 13 Sep 2007). It suggests that algorithms are now pervasive.

The most powerful algorithms are those that do real-time optimization. They could help UPS recalibrate deliveries on the fly, and reorder airplane departure queues at airports to improve throughput. More down-to-earth applications include sophisticated calculations of consumer preference that end up predicting where to put biscuits on supermarket shelves.
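
As a toy illustration of what such real-time optimization looks like (my own sketch, with invented flight names and times, not anything UPS or an airport actually runs): reordering a departure queue so that shorter runway-occupancy slots go first reduces the average wait, the classic shortest-processing-time rule.

    def average_wait(queue):
        """Average time a flight spends waiting for the flights ahead of it."""
        wait, elapsed = 0.0, 0.0
        for _, occupancy_min in queue:
            wait += elapsed
            elapsed += occupancy_min
        return wait / len(queue)

    # (flight, minutes of runway occupancy) -- values invented for the example
    queue = [("UA101", 3.0), ("DL202", 1.5), ("AA303", 4.0), ("WN404", 1.0)]

    fifo = list(queue)                                 # first come, first served
    spt = sorted(queue, key=lambda flight: flight[1])  # shortest occupancy first

    print("FIFO average wait (min):", round(average_wait(fifo), 2))  # 4.0
    print("SPT average wait (min): ", round(average_wait(spt), 2))   # 2.25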

If that’s true, the underlying fragility of algorithms is now pervasive. The fragility is not just due to the risk of software bugs, or vulnerability to hackers; it’s also a consequence of limitations on our ability to conceive of, implement, and manage very large complex systems.

The day-to-day use of these programs shows that they work very well almost all of the time. The occasional difficulty – from Facebook 3rd party plug-in applications breaking for mysterious reasons to the sub-prime mortgage meltdown – reminds us that the algorithmic underpinnings to our society are not foolproof.

In the same issue, the Economist reviews Ian Ayres’s book Super Crunchers: Why Thinking-by-Numbers Is the New Way to Be Smart, about automated decision making (The death of expertise, 13 Sep 2007). According to their reading of his book, “The sheer quantity of data and the computer power now available make it possible for automated processes to surpass human experts in fields as diverse as rating wines, writing film dialogue and choosing titles for books.” Once computers can do a better job at diagnosing disease, what’s left for the doctor to do? Bank loan officers have already faced this question, and many have had to move into customer relations jobs. I used to worry about the employment implications; I still do, but now I also worry about relying on complex software systems.

Sunday, September 09, 2007

Software: complex vs. complicated

Homer-Dixon’s The Ingenuity Gap helped me realize that perhaps the difference between software and more traditional engineering is that bridges (say) are complicated, while software is complex. I follow the distinction I’ve seen in the systems theory literature that complicated refers to something with many parts, whereas complex refers to unpredictable, emergent behavior. Something complicated may not be complex (e.g. a watch), and a complex system might not be complicated (e.g. a cellular automaton).
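
The cellular-automaton example is easy to make concrete. The sketch below runs a generic elementary automaton (Rule 30, my choice for illustration): about as simple and un-complicated as a program gets, yet the pattern it produces is hard to predict without just running it.

    def step(cells, rule=30):
        """One update of an elementary cellular automaton with wrap-around edges."""
        n = len(cells)
        new = []
        for i in range(n):
            neighborhood = (cells[i - 1] << 2) | (cells[i] << 1) | cells[(i + 1) % n]
            new.append((rule >> neighborhood) & 1)
        return new

    width = 63
    cells = [0] * width
    cells[width // 2] = 1  # start from a single live cell

    for _ in range(30):
        print("".join("#" if c else "." for c in cells))
        cells = step(cells)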

A large piece of code meets the criteria for a complex adaptive system: there are many, densely interconnected parts which affect the behavior of the whole in a non-linear way that cannot be simply understood by looking at the components in isolation. A code module can be tested and defined individually, but its behavior in the context of all the other components in a large piece of code can only be observed – and even then not truly understood – by observing the behavior of the whole. If software were linear and simply complicated, extensive testing wouldn’t be required after individually debugged modules are combined into a build.

Some have compared software to bridges while calling (properly) for better coding practices. Holding software development to the same standards as other facets of critical infrastructure is questionable, however, because a software program is to a bridge as the progress of a cocktail party is to a watch. Both parties and watches have a large number of components that can be characterized individually, but what happens at a given party can only be predicted in rough terms (it’s complex because of the human participants) while a watch’s behavior is deterministic (though it’s complicated).

This challenge in writing software is of the “intrinsically hard” kind. It is independent of human cognition because it will catch you eventually, no matter how clever or dumb you are (once you’re at least smart enough to build complex software systems at all).

Geek’s addendum: Definitions of complexity

Homer-Dixon’s definition of complexity has six elements. (1) Complex systems are made up of a large number of components (and are thus complicated, in the meaning above). (2) There is a dense web of causal connections between the parts, which leads to coupling and feedback. (3) The components are interdependent, i.e. removing a piece changes the function of the remainder. (I think this is actually more about resilience than complexity.) (4) Complex systems are open to being affected by events outside their boundaries. (5) They display synergy, i.e. the combined effect of changes to individual components differs in kind from the sum of the individual changes. (6) They exhibit non-linear behavior, in that a change in a system can produce an effect that is disproportionate to the cause.

Sutherland and van den Heuvel (2002) analyze the case of enterprise applications built using distributed object technology. They point out that such systems have highly unpredictable, non-linear behavior where even minor occurrences might have major implications, and observe that recursively building higher-level languages on top of lower-level ones, a source of the power of computing, induces emergent behaviors. They cite Wegner (1995) as having shown that interactive systems are not Turing machines: “All interactions in these systems cannot be anticipated because behavior emerges from interaction of system components with the external environment. Such systems can never be fully tested, nor can they be fully specified.” They use Holland’s (1995) synthesis to show how enterprise application integration (EAI) can be understood as a complex adaptive system (CAS).

References

Sutherland, J. and van den Heuvel, W.-J. (2002). "Enterprise application integration encounters complex adaptive systems: a business object perspective." Proceedings of the 35th Annual Hawaii International Conference on System Sciences (HICSS), 2002.

Wegner, P (1995). “Interactive Foundations of Object-Based Programming.” IEEE Computer 28(10): 70-72, 1995.

Holland, J. H. (1995). Hidden order: how adaptation builds complexity. Reading, Mass., Addison-Wesley, 1995.

Saturday, September 08, 2007

The Ingenuity Gap

Lucas Rizoli pointed me to Thomas Homer-Dixon’s The Ingenuity Gap, which sheds useful light on the hard intangibles problem. Homer-Dixon argues that there’s a growing chasm between society’s rapidly rising need for ingenuity, and its inadequate supply. More ingenuity is needed because the complexity, unpredictability, and pace of events in our world, and the severity of global environmental stress, are soaring. We need more ideas to solve these technical and social problems. Defined another way, the ingenuity gap is the distinction “between the difficulty of the problems we face (that is, our requirement for ingenuity) and our delivery of ideas in response to these problems (our supply of ingenuity).”

Homer-Dixon brings a political scientist’s perspective to the problem, discoursing urbanely and at length (for almost 500 pages) on science, politics, personalities and social trends. He focuses on complexity as the root problem, rather than - as I have - on cognitive inadequacy. In my terminology, he concentrates on intrinsically hard problems rather than ones that are hard-for-humans. Complexity is hard no matter how smart we might be, which is why I’d call it intrinsic.

I’ve recently started thinking about complexity theory in an attempt to find new metaphors for the internet and the web, and it’s a telling coincidence that it’s coming up in the context of hard intangibles, too. It’s a useful reminder that while cognitive limits constrain our control of the world, a lot of it is intrinsically uncontrollable.

Friday, August 31, 2007

Limits to abstraction

Suze Woolf put me on to Grady Booch’s Handbook of Software Architecture, which aims to codify the architecture of a large collection of interesting software-intensive systems, presenting them in a manner that exposes their essential patterns, and that permits comparisons across domains and architectural styles.

Booch mentions in passing on the Welcome page that “abstraction is the primary way we as humans deal with complexity”. I don’t know if that’s true; it sounds plausible. It’s definitely true that software developers deal with complexity this way, creating a ladder of increasingly succinct languages that are ever further away from the nitty gritty of the machine. While there are huge benefits in productivity, there’s also a price to pay; as Scott Rosenberg puts it in Dreaming in Code, “It's not that [developers] wouldn't welcome taking another step up the abstraction ladder; but they fear that no matter how high they climb on that ladder, they will always have to run up and down it more than they'd like--and the taller it becomes, the longer the trip.”

The notion of “limits to abstraction” is another useful way to frame the hard intangibles problem.

These limits may be structural (abstraction may fail because of the properties of a problem, or the abstraction) or cognitive (it may fail because the thinker’s mind cannot process it). In The Law of Leaky Abstractions (2002), Joel Spolsky wrote (giving lots of great examples) “All non-trivial abstractions, to some degree, are leaky. Abstractions fail. Sometimes a little, sometimes a lot. There's leakage. Things go wrong. It happens all over the place when you have abstractions. . . . One reason the law of leaky abstractions is problematic is that it means that abstractions do not really simplify our lives as much as they were meant to.”
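
A tiny illustration of a leak (my example, not one of Spolsky's): floating-point numbers abstract the real numbers, and the abstraction leaks the moment you compare sums.

    import math

    total = 0.1 + 0.2
    print(total)         # 0.30000000000000004 -- the binary representation of decimals leaks through
    print(total == 0.3)  # False

    # Working code has to acknowledge the leak, e.g. by comparing with a tolerance.
    print(math.isclose(total, 0.3))  # True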

There’s more for me to do here, digging into the literature on abstraction. Kit Fine’s book, The Limits of Abstraction (2002) could be useful, though it’s very technical – but at least there have been lots of reviews.

Tuesday, August 28, 2007

The Non-conscious

An essay by Chris Frith in New Scientist (11 August 2007, subscribers only) on difficulties with the notion of free will contains a useful list of experiments and thought-provoking findings.

He reminds us of Benjamin Libet’s 1983 experiment indicating that decisions are made in the brain before our consciousness is aware of them:

“Using an electroencephalogram, Libet and his colleagues monitored their subjects' brains, telling them: "Lift your finger whenever you feel the urge to do so." This is about as near as we get to free will in the lab. It was already known that there is a detectable change in brain activity up to a second before you "spontaneously" lift your finger, but when did the urge occur in the mind? Amazingly, Libet found that his subjects' change in brain activity occurred 300 milliseconds before they reported the urge to lift their fingers. This implies that, by measuring your brain activity, I can predict when you are going to have an urge to act. Apparently this simple but freely chosen decision is determined by the preceding brain activity. It is our brains that cause our actions. Our minds just come along for the ride.”
Frith recalls the Hering illusion, where a background of radiating lines makes superposed lines seem curved. Even though one knows “rationally” that the lines are straight, one sees them as curved. Frith uses this as an analogy for the illusion that we are controlling our actions. To me, this illusion (and perhaps even more profoundly, the Hermann-grid illusion) points to the way different realities can coexist. There is no doubt that humans experience the Hering lines as curved, and that we see shadows in the crossings of the Hermann grid. Likewise, there is no doubt that many (most?) humans have an experience of the divine. The divine is an experiential reality, even if it mightn’t exist by some objective measures.

Other results mentioned include Patrick Haggard’s findings that the act of acting strengthens belief in causation; Daniel Wegner’s work on how one can confuse agency when another person is involved; work by various researchers on how people respond to free riders; and Dijksterhuis et al.’s work on non-conscious decision making, which I discussed in Don’t think about it.

Black Swans in the Economist

A recent Economist (vol 384, no. 8542, 18-24 August 2007) had two great pieces of evidence for Nassim Nicholas Taleb’s contentions in The Black Swan that we never predict the really important events, and that Wall Street is dangerously addicted to (Gaussian) models that do not correctly reflect the likelihood of very rare events.

From On top of everything else, not very good at its job, a review of a history of the CIA by Tim Weiner:

“The CIA failed to warn the White House of the first Soviet atom bomb (1949), the Chinese invasion of South Korea (1950), anti-Soviet risings in East Germany (1953) and Hungary (1956), the dispatch of Soviet missiles to Cuba (1962), the Arab-Israeli war of 1967 and Saddam Hussein's invasion of Kuwait in 1990. It overplayed Soviet military capacities in the 1950s, then underplayed them before overplaying them again in the 1970s.”
From The game is up, a survey of how the sub-prime lending crisis came about:

“Goldman Sachs admitted [that their investment models were useless] when it said that its funds had been hit by moves that its models suggested were 25 standard deviations away from normal. In terms of probability (where 1 is a certainty and 0 an impossibility), that translates into a likelihood of 0.000...0006, where there are 138 zeros before the six. That is silly.”
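
As a back-of-the-envelope check (my own arithmetic, assuming a simple one-sided Gaussian tail, not the Economist's calculation), the probability of a 25-standard-deviation move can be computed directly, and it is indeed of the absurd order the article mocks:

    import math

    sigma = 25
    tail = 0.5 * math.erfc(sigma / math.sqrt(2))  # one-sided Gaussian tail probability
    print(tail)                                   # roughly 3e-138
    print(f"about 1 in 10^{-math.log10(tail):.0f}")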

Saturday, August 25, 2007

Programs as spaces

Paul Graham's essay Holding a Program in One’s Head describes how a good programmer immersed in their code holds it in their mind: "[Mathematicians] try to understand a problem space well enough that they can walk around it the way you can walk around the memory of the house you grew up in. At its best programming is the same. You hold the whole program in your head, and you can manipulate it at will."

The article’s mainly concerned with the organizational consequences of needing to "load the program into your head” in order to do good work. But I want to focus on the spatial metaphor. Thinking through a program by walking through the spaces in your head is an image I've heard other programmers use, and it reminds me of the memory methods described by Frances Yates in The Art of Memory. (While Graham does make reference to writing and reading, I don't think this is aural memory; his references to visualization seem more fundamental.)

I wonder about the kind of cognitive access a programmer has to their program once it’s loaded. Descriptions of walking through a building imply that moment-by-moment the programmer is only dealing with a subset of the problem, although the whole thing is readily available in long-term memory. He’s thinking about the contents of a particular room and how it connects with the other rooms, not conceptualizing the entire house and all its relationships at the same instant. I imagine this is necessarily the case, since short-term memory is limited. If true, this imposes limitations on the topology of the program, since the connections between different parts must be localized and factorizable – when you walk out of the bedroom you don’t immediately find yourself in the foyer. Consequently, problems that can’t be broken down (or haven’t been broken down) into pieces with local interactions of sufficiently limited scope to be contained in short-term memory will not be soluble.

Graham also has a great insight on what makes programming special: "One of the defining qualities of organizations since there have been such a thing is to treat individuals as interchangeable parts. This works well for more parallelizable tasks, like fighting wars. For most of history a well-drilled army of professional soldiers could be counted on to beat an army of individual warriors, no matter how valorous. But having ideas is not very parallelizable. And that's what programs are: ideas." Not only are programming tasks not like fighting wars as Graham imagines them; they're not like manufacturing widgets either. The non-parallelizability of ideas implies their interconnections, and here we have the fundamental tension: ideas may be highly interlaced by their nature, but the nature of the brain limits the degree to which we can cope with their complexity.

Monday, August 20, 2007

Don’t think about it

Dijksterhuis and colleagues at the University of Amsterdam found that unconscious intuition is better than conscious cogitation for some complex problems. I’ve wondered for a while whether Halford & Co’s finding that humans can process at most four independent variables simultaneously would change if one biased the test towards subconscious thinking. The Dijksterhuis work suggests that it might increase the number of processable variables. (See below for references.)

Dijksterhuis et al. (2006) hypothesized that decisions that require evaluating many factors may be better made by the sub-conscious. In one experiment, they asked volunteers to choose a car based on four attributes; this was easy to do, since the choice was constructed to be pretty simple. When subjects were instead asked to think through a dozen attributes, however, they did no better than chance. But when distracted so that thinking took place subconsciously, they did much better. Conclusion: conscious thinkers were better able to make the best choice among simple products, whereas unconscious thinkers were better able to make the best choice among complex products.

Halford defines the complexity of a cognitive process as the number of interacting variables that must be represented in parallel to implement the most complex step in the process (see Halford et al. 1998 for a review). He argues that relational complexity is a serviceable metric for conceptual complexity. Halford et al. (2005) found that a structure defined on four independent variables is at the limit of human processing capacity. Participants were asked to interpret graphically displayed statistical interactions. Results showed a significant decline in accuracy and speed of solution from three-way to four-way interactions; performance on a five-way interaction was at a chance level.

The Amsterdam experiment wasn’t equivalent, because the variables weren’t independent – it seems the decision was a matter of counting the number of positive attributes in the case of car choice. (One car was characterized by 75% positive attributes, two by 50% positive attributes, and one by 25% positive attributes.) Dijksterhuis et al. define complexity as “the amount of information a choice involves;” more attributes therefore means higher complexity. I don’t know how to map between the Dijksterhuis and Halford complexity metrics.

Still, I’ve wondered what might happen if subjects had only seconds to guess at the graphical comparisons used in Halford et al. (2005), rather than finding the answer by deliberation. If they were given Right/Wrong feedback as they went, they might intuitively learn how to guess the answer. (I’m thinking of something like this simulation of the Monty Hall game.) If this were the case, it could undermine my claims about the innate limits to software innovation for large pieces of code (or projects in general) with large numbers of independent variables.
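
For readers who haven't seen one, this is the kind of Monty Hall simulation I have in mind (a sketch of my own, not the one linked above). Run enough trials with Right/Wrong feedback and the always-switch policy visibly wins, whether or not anyone reasons it through:

    import random

    def play(switch):
        doors = range(3)
        prize = random.randrange(3)
        choice = random.randrange(3)
        # The host opens a door that hides no prize and wasn't chosen by the player.
        opened = random.choice([d for d in doors if d not in (prize, choice)])
        if switch:
            choice = next(d for d in doors if d not in (choice, opened))
        return choice == prize

    trials = 100_000
    for switch in (False, True):
        wins = sum(play(switch) for _ in range(trials))
        print(f"switch={switch}: win rate {wins / trials:.3f}")  # ~0.333 vs ~0.667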

References

Ap Dijksterhuis, Maarten W. Bos, Loran F. Nordgren, and Rick B. van Baaren (2006)
On Making the Right Choice: The Deliberation-Without-Attention Effect
Science, 17 February 2006, 311: 1005-1007 [DOI: 10.1126/science.1121629]

Graeme S. Halford, Rosemary Baker, Julie E. McCredden, John D. Bain (2005)
How Many Variables Can Humans Process? (experiment)
Psychological Science 16 (1), 70–76

G. S. Halford, W. H. Wilson, & S. Phillips, (1998)
Processing capacity defined by relational complexity: Implications for comparative, developmental, and cognitive psychology (review article)
Behavioral and Brain Sciences, 21, 803–831.

Thursday, August 16, 2007

Music: back to tangibles

The Economist sums up a story on the record labels’ new business model with an Edgar Bronfman quote: “The music industry is growing. The record industry is not growing.”

It seems the labels have decided that they need a cut of more than just a band’s CD sales; their new contracts include live music, merchandise, and endorsement deals.

Just as the old instincts for relationships and reality have driven the pet industry to generate more revenue than media (see my post Animal Instincts), tangibles are reasserting themselves in the music industry. The Economist, citing the Music Managers Forum trade group, reports that seven years ago, record-label musicians derived two-thirds of their income from pre-recorded music, with the other one-third coming from concert tours, merchandise and endorsements. Those proportions have now been reversed. For example, concert-ticket sales in North America alone increased from $1.7 billion in 2000 to over $3.1 billion last year, according to Pollstar, a trade magazine.

Sunday, August 12, 2007

Animal Instincts

Americans spend $41 billion per year on their pets, according to a feature article in BusinessWeek. That’s about $400 per household, and more than the GDP of all but 64 countries in the world.

It’s good to know that the old hard-wired priorities – relationships, even with animals, and schtuff – still trump the new intangibles. According to BW, the yearly cost of buying, feeding, and caring for pets is more than the combined sum of what Americans spend on the movies ($10.8 billion), playing video games ($11.6 billion), and listening to recorded music ($10.6 billion).

Pet care is the second-fastest growing retail sector after consumer electronics. But the intangible economy is unavoidable even here: services like ‘pet hotels’ (kennels, to you and me), grooming, training, and in-store hospitals, have helped PetSmart expand its service business from essentially nothing in 2000 to $450 million, or 10% of overall sales, this year.

(It seems that pets are now more popular as companions for empty-nesters, single professionals and DINKYs, than as kids’ sidekicks. With this kind of infantilization, will it be long before more grown-ups start admitting to still sleeping with teddy bears?)

Saturday, August 11, 2007

The mortgage mess as a cognitive problem

The sub-prime mortgage debacle is a problem of cognitive complexity. A lack of understanding of the risks entailed by deeply nested loan relationships is leading to a lack of trust in the markets, and this uncertainty is leading to a sell-off. More transparency will help – but has its limits.

A story on NPR quotes Lars Christensen, an economist at Danske Bank, as saying that there is no trust in the market because of the unknown complexity of the transactions involved. (Adam Davidson, “U.S. Mortgage Market Woes Spread to Europe,” All Things Considered, Aug 10th, 2007; more was broadcast than seems to be in the online version.)

This is a ‘hard intangibles’ problem: intricate chains and bundles of debt arise because there’s no physical limit on the number of times these abstractions can be recomposed and layered, with banks lending to other banks against bundles of bundled loans as collateral. When questions arise about the solvency of one of the root borrowers, uncertainty spreads, in large part because there’s no transparency into what anyone else is holding. According to the Economist (“Crunch time,” Aug 9th 2007), complex, off-balance sheet financial instruments were the catalyst for the market sell-off. Phrases like “investors have begun to worry about where else such problems are likely to crop up” suggest that lack of understanding is driving uncertainty. The entire market is frozen in place, like soldiers in a minefield: one bomb has gone off, but no-one knows where the next is buried.

One of the drivers of the problem, according to the Financial Times (Paul J Davies, “Who is next to catch subprime flu?” Aug 9th 2007), is that low interest rates have propelled investors into riskier and more complex securities that pay a higher yield. “Complexity” is a way of saying that few if any analysts truly understand the inter-relationships among these instruments. The market is facilitated by the use of sophisticated models (i.e. computer programs) that predict the probabilities of default among borrowers, given the convoluted structure of asset-backed bonds. As the crisis has evolved, banks have come to realize – again, suggesting that this was not immediately obvious – that they’re exposed to all forms of credit markets, to more forms of credit risk than they thought.
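
To see how this kind of complexity turns into model risk, here is a deliberately crude sketch: all the numbers are invented, and it is nothing like a real pricing model. The chance that a supposedly safe senior slice of a loan pool takes losses turns out to be extremely sensitive to the default correlation the model assumes:

    import random
    from statistics import NormalDist

    def senior_hit_rate(n_loans=100, p_default=0.05, rho=0.0,
                        attachment=0.10, trials=20_000):
        """Fraction of scenarios in which pool defaults exceed the senior slice's cushion."""
        threshold = NormalDist().inv_cdf(p_default)
        hits = 0
        for _ in range(trials):
            market = random.gauss(0, 1)  # shared economic factor
            defaults = sum(
                1 for _ in range(n_loans)
                if (rho ** 0.5) * market + ((1 - rho) ** 0.5) * random.gauss(0, 1) < threshold
            )
            if defaults / n_loans > attachment:
                hits += 1
        return hits / trials

    for rho in (0.0, 0.2, 0.4):
        print(f"assumed default correlation {rho:.1f}: "
              f"senior slice hit in {senior_hit_rate(rho=rho):.2%} of scenarios")

A single assumed parameter moves the senior slice from almost never being hit to being hit uncomfortably often, which is roughly the kind of surprise the banks have been reporting.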

This suggests a policy response to a world of hard intangibles: enforced transparency. The US stock market, the most advanced in the world, has of necessity evolved to be more transparent, in terms of disclosure requirements on company operations, than its imitators. According to the FT, the Bundesbank is telling all the German institutions to put everything related to sub-prime problems on the table – indicating that increased visibility for the market will improve matters.

It’s striking how little the banks seem to know. BusinessWeek quotes an economist at CalTech as recommending that the Federal Reserve insist that firms rapidly evaluate their portfolios to determine exactly how much of the toxic, risky investments they hold – by implication, they don’t know. (Ben Steverman, “Markets: Keeping the Bears at Bay,” Aug 10th 2007.) The situation is summed up well here:

“The problem here is that the financial industry has created a raft of new, so-called innovative debt products that are hard, even in the best of times, to place an accurate value on. "You don't have the transparency that exists with exchange products," the second-by-second adjustment in a stock price, for example, says Brad Bailey, a senior analyst at the Aite Group. The products are so complex that many investors might have bought them without realizing how risky they are, he says.”

Transparency may be a useful solvent for governance problems in all complex situations. For example, in an unpublished draft paper on network neutrality, analysts at RAND Europe recommend that access to content on next-generation networks be primarily enforced via reporting requirements on network operators, e.g. requiring service providers to inform consumers about the choices they are making when they sign up for a service – one of the keystones of the Internet Freedoms advocated by the FCC under Michael Powell. This may be one of the only ways to provide some degree of management of the modularized value mesh of today’s communications services.

However, transparency as a solution is limited by the degree to which humans can make sense of the information that is made available. If a structure is more complex than we can grasp, then there are limits to the benefits of knowing its details. I hesitate to draw the corollary: that limits should be imposed on the complexity of the intangible structures we create.

Friday, August 10, 2007

Wang Wei

Wang Wei is one of China’s greatest poets (and painters), the creator of small, evocative landscape poems that are steeped in tranquility and sadness. More than that, though: while the poems are about solitude rooted in a serious practice of ch’an meditation, Wang worked diligently as a senior civil servant all his life.

I found his work via a review of Jane Hirshfield’s wonderful collection After. David Hinton’s selection of Wang Wei poems is beautifully wrought. The book is carefully designed, and the Introduction and Notes are very helpful.

Wang’s poetry gives no hint of the daily bureaucracy that he must have dealt with. It’s anchored in his hermitage in the Whole-South Mountain, which was a few hours from the capital city where he worked. Wang was born in the early Tang dynasty to one of the leading families. He had a successful civil service career, passing the entrance exam at the young age of 21 and eventually serving as chancellor. His wife died when he was 29, at which point he established a monastery on his estate. He died at the age of 60.

Deer Park is one of his most famous poems. Here’s Hinton’s translation:
No one seen. Among empty mountains,
hints of drifting voice, faint, no more.

Entering these deep woods, late sunlight
flares on green moss again, and rises.
(For more translations, see here and here.)

Wang’s work shows that one can make spiritual progress while also participating in society. One does not have to give up the daily life and become a monk, though it’s surely important that one’s mundane activities make a contribution to the good of society.

Thursday, August 09, 2007

Factoid: 19 million programmers by 2010

According to Evans Data Corp, the global developer population will approach 19 million in 2010. (I found this via ZDNet's ITFacts blog; the EDC site requires registration to even see the press release.) That's quite a big number - the total population of Australia, for example.

Programming will not be a marginal activity, and any fundamental cognitive constraints on our ability to develop increasingly complex software will be impossible to avoid.

A lot of the growth will come from new countries bringing programmers online: EDC forecasts that the developer population in APAC will grow by 83% from 2006 to 2010, compared to just a 15% increase in North America over the same period. This should keep the skill level high, since the newcomers will be drawn from the most talented people in those countries, rather than from expanding the percentage of the population that programs in any given country - and thus reducing average skill.

Therefore, the qualitative problems of programming won't change much in the next 5-10 years. Beyond that, however, we may also face the issue of declining innate skill levels among programmers.

Sunday, July 22, 2007

IT Project Success: Getting Better, but Big is still Bad

The biennial “Chaos Report” on IT project success from The Standish Group reports that the success/failure ratio flipped between 1994 and 2006. In 1994 the ratio for “flat failures” vs. “complete successes” was a depressing 31% vs. 16%; in 2006 it was a more encouraging 19% vs. 35%. (The work is reported in CIO; the Standish Group web site is remarkably reticent, and doesn’t seem to have any press releases, let alone publicly available recent data.)

On page 2 of the CIO story, the Standish CEO says: “Seventy-three percent of projects with labor cost of less than $750,000 succeed. . . . But only 3 percent of projects with a labor cost of over $10 million succeed. I would venture to say the 3 percent that succeed succeeded because they overestimated their budget, not because they were managed properly.” A $750,000 project is pretty tiny: six developers for six months, at $250,000/developer/year fully loaded. Even a $10 million project is only 20 developers for two years.

This result matches received wisdom that large projects are more likely to fail, which I attribute at least in part to the cognitive challenge of wrapping one’s head around large problems.

What should one do about it? It implies that smaller projects are the only way to go – but what if one has ambitious goals? If it’s true that one can construct complex solutions out of many small, simple parts, everything’s fine. But I’m deeply suspicious of the “divide and conquer” or “linearization” assumption. There are many important problems that just can’t be broken up, from inverting a matrix to simulating non-linear systems.

This may be a cultural reality check: many ambitious goals may simply not be achievable. Humility may be the best way to ensure success. I doubt politicians and business executives want to hear this. Trying to fly too high brought Icarus down – exactly as his engineer-father Daedalus had warned.

And things may not get better: as technology progresses, the complexities of our systems will grow, and linear solutions become even less useful. As the interconnectedness and intangibility of society grows, we may have to become more humble, not more bold, because that will be the only way to get stuff done. It’s counter-intuitive that as technology progresses we need to become less, not more, ambitious, but that may be the way things work with the new intangibles.

NOTES

My thanks to Henry Yuen for referring me to this story.

I have some reservations about the Standish data. It’s proprietary, and there are academics who’ve questioned it for years. CIO provides some background on the Chaos Report and its methods in an interview with the CEO; it also summarizes questions about their method. One has to wonder how the sample population has changed over the years. If the number of small projects in the sample has grown over time, then the success rate reported above would increase simply because smaller projects fail less often, not because project management performance has improved.

Tuesday, July 17, 2007

Business: a City, not an Ecosystem

Geoffrey West’s work on scaling in cities provides ammunition for my critique of the “business ecosystem” analogy (Ecosystem alert, Eco mumbo jumbo). New Scientist reports on a recent paper by West and co-workers which found that some urbanization processes differ dramatically from biological ones (references below).

Describing the city as an organism is a much-loved metaphor; New Scientist quotes Frank Lloyd Wright waxing lyrical about “thousands of acres of cellular tissue . . . enmeshed by an intricate network of veins and arteries.”

We like to think that cities work like biological entities, just as we like to think that industries work like networks of organisms. But West & Co’s work indicates that the analogy is flawed. As animals get larger, their metabolism slows down. This is true in some respects for cities, but in others the opposite holds. Infrastructure metrics, like the number of gas stations and miles of paved roads, scale like biological ones: the amount grows more slowly than the size of the city. But for the things that really count, things speed up. For example, measures of wealth creation and innovation - the number of patents, total wages, GDP - grow more rapidly than city size. Bigger cities have a faster metabolism than smaller ones, unlike animals.
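
Operationally, these scaling claims come down to the exponent of a log-log fit. Here is a small sketch with synthetic city data; the exponents below are invented to mimic the qualitative result, not taken from the paper.

    import math
    import random

    def fit_exponent(populations, quantities):
        """Slope of an ordinary least-squares fit of log(quantity) against log(population)."""
        xs = [math.log(p) for p in populations]
        ys = [math.log(q) for q in quantities]
        mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
        return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
                / sum((x - mx) ** 2 for x in xs))

    random.seed(1)
    pops = [10 ** random.uniform(4, 7) for _ in range(200)]  # cities of 10k to 10M people
    roads = [0.05 * p ** 0.83 * random.lognormvariate(0, 0.1) for p in pops]  # sublinear
    wages = [2.0 * p ** 1.12 * random.lognormvariate(0, 0.1) for p in pops]   # superlinear

    print("infrastructure exponent:", round(fit_exponent(pops, roads), 2))  # ~0.83: grows more slowly than population
    print("wages exponent:", round(fit_exponent(pops, wages), 2))           # ~1.12: grows faster than population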

Industries resemble cities more than they resemble ecosystems. Increasing returns with size and non-zero-sum interactions are key characteristics found in both cities and industries, but not in biological systems. “Business is Urbanism” is a more accurate and productive metaphor than “Business is an Ecosystem”.

P.S. While we’re talking about ecosystems... The very notion of ecosystem is, of course, itself a metaphor: Nature is a System. The American Heritage Dictionary defines a system as “A group of interacting, interrelated, or interdependent elements forming a complex whole.” It is presumed that there is an observable whole; that it can be broken down into elements; and that the elements interact. So when people use the Business is an Ecosystem metaphor, I think what they’re really doing is simply using Business is a System, and cloaking it with the numinous mantle of Nature (cf. Ecosystem alert).

References

Dana Mackenzie, Ideas: the lifeblood of cities, New Scientist, 23 May 2007 (subscription wall)

Bettencourt, Lobo, Helbing, Kuhnert & West, Growth, innovation, scaling, and the pace of life in cities, Proceedings of the National Academy of Sciences, vol 104, p 7301, 24 April 2007

Sunday, July 15, 2007

When physicists see a power law, they think in terms of phase transitions, and they smell Nobel prizes. They are like sharks with blood in the water

--- Steven Strogatz on the fuss about scaling laws, quoted in New Scientist, Ideas: the lifeblood of cities, 23 May 2007

In context, from New Scientist:

During the 20th century, many researchers studying urban growth focused on economies of scale and their effect on wages. In 1974, Vernon Henderson of Brown University in Rhode Island proposed that cities reach an optimal size by growing until their workers' per capita income reaches a maximum; when it starts to decline, workers leave for other cities. More recently, researchers including West have tried to identify deeper mechanisms behind these societal patterns. Though West is a physicist by training, his reputation stems mostly from his pioneering and controversial work on scaling laws in biology - how things change with size.

What is all the fuss about scaling laws? "Physicists are used to thinking about extremely large systems of identical particles," says Steven Strogatz, a mathematician at Cornell University in Ithaca, New York. Take a piece of iron: at high temperatures, the spins of the particles jiggle around in random directions. If you gradually lower the temperature, the spins stay random until you reach a critical point - then they suddenly line up, and you have a ferromagnet.

This switch from disorder to order is called a phase transition. In the 1960s, physicists noticed that phase transitions follow certain universal patterns, called power laws, even if they have nothing in common physically. Kenneth Wilson of Cornell showed in the 1970s that these power laws come about through the growth of fractal structures, work which won him the Nobel prize in 1982. Since then, Strogatz says, "When physicists see a power law, they think in terms of phase transitions, and they smell Nobel prizes. They are like sharks with blood in the water."

Not that weird

Peter Pitsch’s The Innovation Age (1996) made me question something I’ve taken for granted: that complexity and uncertainty in the economy is growing, and doing so at an unprecedented rate. Pitsch’s book is based on this premise, and it made me wonder: what is the evidence?

The number of industry players who are inter-connected may be growing due to the Internet and cheap global travel, but an individual company is not necessarily directly connected to more counterparts than before. It’s a bigger graph, but when one looks at individual nodes, the connectivity is much as it has always been.

Uncertainty isn’t new, either. Pitsch mentions the late Middle Ages as a tumultuous period that produced amazing innovation, and the Industrial Revolution was similar. The uncertainty in aggregate may be larger today, but so is the world population; has the normalized per-capita uncertainty grown? A reasonable measure might be stock market volatility. Schwert’s data for the 1859 – 1987 period doesn’t show any trends I can see with the naked eye (G.W. Schwert, “Why Does Stock Market Volatility Change Over Time," Journal of Finance, vol. 44, pp. 1115-1153, 1989). Market uncertainty, at least, is much the same.
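
For concreteness, this is roughly what such a volatility measure looks like: a minimal sketch of my own, run here on an invented price series, computing the annualized rolling standard deviation of daily log returns.

    import math

    def rolling_volatility(prices, window=21, trading_days=252):
        """Annualized standard deviation of daily log returns over a sliding window."""
        returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
        vols = []
        for i in range(window, len(returns) + 1):
            chunk = returns[i - window:i]
            mean = sum(chunk) / window
            var = sum((r - mean) ** 2 for r in chunk) / (window - 1)
            vols.append(math.sqrt(var * trading_days))
        return vols

    # An invented, gently wobbling price series, just to exercise the function.
    prices = [100 * math.exp(0.0005 * t + 0.01 * math.sin(t / 3)) for t in range(120)]
    vols = rolling_volatility(prices)
    print(f"mean annualized volatility: {sum(vols) / len(vols):.1%}")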

I now believe that this is the Special Present Fallacy at play again. We’re always biased to see the present moment as exceptional; the odds are that it’s not.

Thursday, July 12, 2007

Factoids: cost of clinical trials

According to Thomson CenterWatch, a publishing company that focuses on the clinical trials industry, companies need to recruit about 4,000 people to test an experimental drug at a cost of up to $25,000 for each person. That translates into $100 million at the high end.

Source: Advertorial on Clinical Trials in New Scientist, 16 June 2007

Factoid: A typical cellphone user spends 80% of his or her time communicating with just four other people

--- Source: Stefana Broadbent, an anthropologist who leads the User Adoption Lab at Swisscom, cited by the Economist in a Tech Quarterly story on June 9, 2007: Home truths about telecoms.

They also quote her thus: "The most fascinating discovery I've made this year is a flattening in voice communication and an increase in written channels. . . . Users are showing a growing preference for semi-synchronous writing over synchronous voice." The Economist's gloss: "Her research in Switzerland and France found that even when people are given unlimited cheap or free calls, the number and length of calls does not increase significantly. This may be because there is only so much time you can spend talking; and when you are on the phone it is harder to do other things. Written channels such as e-mail, text-messaging and IM, by contrast, are discreet and allow contact to be continuous during the day."

It seems writing really is a useful alternative channel. I guess there's a reason why the BlackBerry was so successful.

Sunday, July 08, 2007

Ecosystem alert

When you see references to ecosystems in a business story, raise the shields. Someone is trying to mess with your mind.

The current Business Week has two good examples. An adulatory story about the "Apple ecosystem" (Welcome to Planet Apple, which ran as Welcome to Apple World in hard copy) describes how the company has built its network of partners. Implicit is Iansiti and Levien's notion that the most influential companies are "keystone species in an ecosystem." As I argued in Eco mumbo jumbo, the analogy is flawed in a long list of ways. For example: species don't choose to be keystones; companies interact voluntarily, but one organism consumes another against its will; and biological systems have neither goals nor external regulators, whereas industries have both.

The ecosystem analogy is used unthinkingly in this story, judging by the hodgepodge of other metaphors that appear: "[the] ecosystem has morphed from a sad little high-tech shtetl into a global empire," "[its] new flock of partners," "a gated, elitist community," "the insular world of the Mac," "the Apple orchard . . . is still no Eden." Note, though, that most of them refer to places, with a nod to nature.

To get a sense of what's really going on when the ecosystem metaphor is used, let's look at another story, Look Who's Fighting Patent Reform. Computing companies have been pushing for patent reform on Capitol Hill, but "[t]he past few weeks have brought an unexpected surge of opposition from what one lobbyist calls the 'innovation ecosystem'—a sprawling network of entrepreneurs, venture capitalists, trade groups, drug and medical equipment manufacturers, engineering societies, and research universities." It's a term used by the special pleader. The only substantive resemblance to an ecosystem is that these groups connect to each other in a network. The rhetorical benefit, though, is to invoke the commonplace Nature Is Good. Nature is unspoiled, bountiful, self-regulating: the antithesis of concrete-covered recklessly-regulating partisan politicking. Nature is a metaphor that appeals to both sides of the political divide: it's organic, but competitive; it's inter-related, but dynamic; it's nurturing, but stern in its consequences. It's therefore ideal when trying to put a halo around an otherwise unsympathetic subject.

Friday, July 06, 2007

Trading on News

Follow the money, if you want to know where the action is in AI (and most other things). Trading houses are buying tagged news feeds so that they can process them as input for algorithmic trading. That'll have to be a pretty smart news reader.
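
I have no idea what these systems actually look like inside, but here's a toy sketch of the general idea: score an incoming item from its tags and turn the score into an order. Everything here (tag names, weights, thresholds) is invented for illustration.

```python
# A toy, entirely hypothetical sketch of "processing tagged news as input for
# algorithmic trading": score an item from its tags and map the score to an order.
# Real systems are vastly more sophisticated.
TAG_WEIGHTS = {"earnings-beat": 1.0, "guidance-cut": -1.0, "merger-rumor": 0.5}

def signal(item: dict) -> float:
    """Sum the weights of the tags attached to a news item."""
    return sum(TAG_WEIGHTS.get(tag, 0.0) for tag in item["tags"])

def order(item: dict, threshold: float = 0.5) -> str:
    """Turn a signal into a buy/sell/hold decision for the item's ticker."""
    s = signal(item)
    if s >= threshold:
        return f"BUY {item['ticker']}"
    if s <= -threshold:
        return f"SELL {item['ticker']}"
    return "HOLD"

print(order({"ticker": "XYZ", "tags": ["earnings-beat", "merger-rumor"]}))  # BUY XYZ
```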

The Economist story that reports this development contains these factoids:
  • Algorithmic trading accounts for a third of all share trades in America
  • The Aite Group reckons it will make up more than half the share volumes and a fifth of options trades by 2010
  • The new London Stock Exchange system catering to the growth of algorithmic trading cuts trading times down to ten milliseconds; on its first day, it processed up to 1,500 orders a second, compared with 600 using its previous system
The story closes with the observation that "the news may come from reading the algorithmic trades, not the other way around," because these systems may be able to spot early price signals of a takeover decision before it's announced.

Taken a step further, I can imagine the trading houses selling re-tagged news feeds back to Dow Jones and Reuters. Suitably anonymized, aggregated, and delayed to protect the individual movers, information on which news items triggered trades would be useful at second order. And then the news providers can sell the re-re-tagged feeds to the traders, who'll then sell back the re-re-re-tagged . . .

Tuesday, July 03, 2007

Factoid: Americans spent only 0.2% of their money but 10% of their leisure time on the internet

Source: Austan Goolsbee and Peter Klenow, "Valuing Consumer Products by the Time Spent Using Them: An Application to the Internet," draft available at http://faculty.chicagogsb.edu/austan.goolsbee/research/timeuse.pdf. This result suggests that conventional consumer surplus calculations significantly understate the value of the internet.

Paper abstract:

For some goods, the main cost of buying the product is not the price but rather the time it takes to use them. Only about 0.2% of consumer spending in the U.S., for example, went for Internet access in 2004 yet time use data indicates that people spent around 10% of their entire leisure time going online. For goods like that, estimating price elasticities with expenditure data can be difficult and, therefore, estimated welfare gains highly uncertain. We show that for time-intensive goods like the Internet, a simple model in which both expenditure and time contribute to consumption can be used to estimate the consumer gains to a good using just the data on time use and the opportunity cost of people's time (i.e., the wage). The theory predicts that higher wage internet subscribers should spend less time online (for non-work reasons) and the degree to which that is true determines the elasticity of demand. Based on expenditure and time use data and our elasticity estimate, we calculate that consumer surplus from the Internet may be around 2% of full-income, or several thousand dollars. This is an order of magnitude larger than what one obtains from a back-of-the-envelope calculation using data from expenditures.
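
To see why the time component dominates, here's a back-of-the-envelope sketch with round numbers I've made up (they are not the paper's): value time online at the wage and compare it with what's actually spent on access.

```python
# A back-of-the-envelope sketch of the paper's logic with made-up round numbers:
# value time spent online at the opportunity cost of time (the wage) and compare
# it with spending on access. Not the paper's estimation procedure.
hours_online_per_week = 10          # hypothetical
wage_per_hour = 20.0                # hypothetical opportunity cost of time, $/hour
access_cost_per_month = 25.0        # hypothetical subscription price

time_cost_per_year = hours_online_per_week * 52 * wage_per_hour
money_cost_per_year = access_cost_per_month * 12

print(f"time cost  ~ ${time_cost_per_year:,.0f}/year")    # ~$10,400
print(f"money cost ~ ${money_cost_per_year:,.0f}/year")   # ~$300
# The time cost dwarfs the expenditure, which is why expenditure-based surplus
# calculations understate the value of the internet.
```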

Monday, July 02, 2007

The Known, the Unknown, the Unknowable

Thanks to an acknowledgement in Nassim Taleb's The Black Swan, I've found this fascinating program on The Known, Unknown, and Unknowable, run by Jesse Ausubel of the Sloan Foundation.

Ralph Gomory, the Sloan Foundation's president, outlines this vision in a short essay published in Scientific American in 1995. He believes that the artificial is simpler, and thus more predictable, than the natural. However, he notes: "Large pieces of software, as they are expanded and amended, can develop a degree of complexity reminiscent of natural objects, and they can and do behave in disturbing and unpredictable ways."

The program led to a workshop at Columbia in 2000; I look forward to digging into the proceedings. One of the papers that caught my eye was Ecosystems and the Biosphere as Complex Adaptive Systems by Simon Levin. It seems an open question whether evolution increases the resiliency of ecosystems or leads to criticality - a very important matter for business people hoping that aping ecosystems will improve stability!

Thursday, June 28, 2007

In the last fifty years, the ten most extreme days in the financial markets represent half the returns.

Factoid (verbatim) from Nassim Nicholas Taleb's The Black Swan: The Impact of the Highly Improbable, Random House 2007, p. 275.

In the caption to Figure 14 on the following page, which illustrates S&P 500 returns, Taleb observes: "This is only one of many such tests. While it is quite convincing on a casual read, there are many more-convincing ones from a mathematical standpoint, such as the incidence of 10 sigma events."
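
Anyone with a daily return series can run the test themselves; a minimal sketch follows. The returns here are simulated with fat tails just to make the code self-contained; substitute real S&P 500 data to reproduce Taleb's figure.

```python
# A minimal sketch of the "ten most extreme days" check: compare the sum of all
# daily returns with the contribution of the ten largest moves. Simulated,
# fat-tailed returns stand in for the real S&P 500 series.
import numpy as np

rng = np.random.default_rng(1)
daily_returns = rng.standard_t(df=3, size=50 * 252) * 0.01   # ~50 years of trading days

total = daily_returns.sum()
ten_extreme = daily_returns[np.argsort(np.abs(daily_returns))[-10:]]

print(f"sum of all daily returns:    {total:.3f}")
print(f"contribution of 10 extremes: {ten_extreme.sum():.3f}")
```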

(This post marks the end of my attempt to use MSN Spaces for my Factoids site. It was just too clunky. I'll now post factoids here; as they pile up, you'll find all of them by clicking on the "factoids" label at the end of the post.)

Wednesday, June 27, 2007

Eco mumbo jumbo

I’m coming to the conclusion that the “business ecosystem” metaphor is nonsense. That’s a pity, since I speculated in Tweaking the Web Metaphor that the food web might be a useful metaphor for the internet, conceived as a “module ecosystem.” [1] Bugs in the business ecosystem mapping would be even more unfortunate for people who’ve made strategic business decisions on the basis of this flawed metaphor.

“Business is an ecosystem” is an analogy, and like any argument from analogy it is valid to the extent that the essential similarities of the two concepts are greater than the essential differences. I will try to show (at too much length for a blog post, I know...) that the differences are much greater than the similarities.

This biological analogy is very popular. A Google search on "business ecosystem" yielded about 154,000 hits, "software ecosystem" gave 76,000 hits (Microsoft’s in 47,200 of them), and “computing ecosystem” 18,000 hits.

So where’s the problem?

Let me count the ways.

1. A biological ecosystem is analyzed in terms of species, each of which represents thousands or millions of organisms. Business ecosystems are described in terms of firms: just one of each. A food web of species summarizes billions of interactions among individual organisms; a business web of companies is simply the interaction among the firms studied.

2. Species are connected, primarily, by flows of energy and nutrients. A is eaten by B is eaten by C is eaten by D, etc. Energy is lost as heat at every step. In the business system, a link primarily represents company B buying something from company A. Goods flow from A to B, and money flows back. Both A and B gain, otherwise they wouldn’t have entered into the transaction. Therefore, the system isn’t lossy, as it is in a food web. In fact, gains from trade suggest that specialization, which leads to more interacting firms, creates more value. The links between companies could also stand for co-marketing relationships, technology sharing and licensing agreements, collusion, cross-shareholding, etc.; however, these have the same non-zero-sum characteristics as monetary exchange does.

3. One might sidestep these problems by claiming that species are mapped to firms, and individual animals are mapped to the products that a firm sells. That solves the multiplicity mismatch in #1, and, if one just considers the material content of products, the entropy problem in #2. However, two problems remain. First, the value of products is mostly in the knowledge they embody, not their matter; knowledge (aka value add) is created at every step, the inverse of what happens with the 2nd Law of Thermodynamics. Second, companies sell many diverse products. The fudge only works if a species is mapped to a product unit (in fact, to the part of a product unit that produces a single SKU), rather than to a firm.

4. Species change slowly, and their role in an ecosystem changes very slowly; on the other hand, companies can change their role in the blink of an eye through merger, acquisition or divestiture. Interactions between firms can be changed by contract, whereas those between species are not negotiable (except perhaps over very long time scales, through the evolution of defensive strategies).

5. Biological systems are unstable; the driving force of ecological succession is catastrophe. [2] Businesses seek stability, and the biological metaphor is used as a source of techniques to increase resilience; see e.g. Iansiti and Levien’s claim that keystone species lead to increased stability in an ecosystem. [3], [4] If one seeks stability, biological systems are not a good place to look.

6. Biological systems don’t have goals, but human ones do. There are no regulatory systems external to ecosystems, but human markets have many, such as the rule of law and anti-trust. Natural processes don’t care about equity or justice, but societies do, and impose them on business systems. If ecosystems were a good model for business networks, there would be no need for anti-trust in markets.

7. End-consumers are not represented at all in the “business ecosystem” model. Von Hippel and others [5] who study collaborative innovation could be seen to be pointing to customers - or at least some of them - as nodes in the business ecosystem, but the same problems about singularity/multiplicity noted above apply here.

8. Companies are exhorted to invest in their ecosystem if they want to be keystone species. Keystone species don’t necessarily (or usually) represent a lot of biomass, so it’s not clear why a firm would want to be a keystone. (And of course, the metaphor leaves unstated whether biomass maps to total revenue, profitability, return on investment, or something else.) More generally: being a keystone species isn’t a matter of choice for the animal concerned; the keystone relationship arises from the interactions among species as a matter of course.

The business ecosystem metaphor in use

Iansiti and Levien are high profile proponents of business ecosystems. [3] [4] In The Keystone Advantage, they motivate the analogy between business networks and biological ecosystems by arguing that both are “formed by large, loosely connected networks of entities”, both “interact with each other in complex ways”, and that “[f]irms and species are therefore simultaneously influenced by their internal capabilities and by their complex interactions with the rest of the ecosystem.” They state that the key analogy they draw “is between the characteristic behavior of a species in an ecosystem and the operating strategy of a strategic business unit.” They declare the stakes when they continue: “To the extent that the comparison of business units to ecosystems [I presume they mean “to species”] is a valid one, it suggests that some of the lessons from biological networks can fruitfully be applied to business networks.”

To caricature their argument: Ecosystems are networks; business networks are networks; therefore business networks are ecosystems. Hmmm...

They were more circumspect in the papers that preceded the book, where they try to dodge the weakness of the analogy that underpins their argument by contending that they don’t mean it: “[W]e are not arguing here that industries are ecosystems or even that it makes sense to organize them as if they were, but that biological ecosystems serve both as a source of vivid and useful terminology as well as a providing some specific and powerful insights into the different roles played by firms” ([4], footnote 10). They want to have it both ways: get the rhetorical boost of a powerful biological metaphor, but avoid dealing with parts of the mapping that are inaccurate or misleading. Yet, as their own book concedes, if the comparison is not valid, then their insights cannot be validated by the analogy. They do, however, attempt a mapping. For example, they attempt to answer the question “What makes a healthy business ecosystem?” by examining ecosystem phenomena like (1) hubs, which are said to account for the fundamental robustness of nature’s webs, (2) robustness, measured by survival rates in a given ecosystem, (3) the productivity of an ecosystem, analogized to total factor productivity analysis in economics, and (4) niche creation.

In most if not all cases, the appeal to ecosystems is very superficial; no substantive analogy is drawn. For example, Messerschmitt and Szyperski’s book [6], which made it into softcover, is entitled Software Ecosystem, but its remit seems to be simply to examine software “in the context of its users, its developers, its buyers and sellers, its operators, society and government, lawyers, and economics”; the word ecosystem doesn’t even appear in the index. (Disclaimer: I haven’t read the book.) The word ecosystem is simply meant to evoke a community of interdependent actors, with no reference beyond that to dynamics or behavior.

A more generic flaw with the business ecosystem metaphor is that most people are more familiar with businesses than with ecosystems. Successful metaphors usually explain complex or less-known things in terms of simpler, more familiar ones. Shall I compare thee to a Summer's day? The rhetorical appeal of the business ecosystem analogy must lie beyond its rather weak ability to domesticate a strange idea.

Why do careful scholars resort to the ecosystem metaphor in spite of its obvious flaws? The image of nature is so powerful that it is a symbol too potent to pass up. Nature represents The Good (at least in our culture at this time), and therefore an appeal to a natural order is a compelling argument if one’s claims bear some resemblance to what’s happening in nature. However, if nothing else, this reminds me of Hegel’s historicist cop-out that what is real is rational, and what is rational is real. Just because nature is constructed in a certain way doesn’t mean that industries should be.

Perhaps my standards for metaphors are too high. To me, a conceptual metaphor is a mapping of one set of ideas onto another; one has to take the “bad” elements of the mapping with the “good”. If the good outweighs the bad, and if the metaphor produces insight and new ideas, then the analogy has value. Others just take the “good” and simply ignore the “bad”. For me, a metaphor is a take-it-all-or-leave-it set menu, not something to pick from a la carte.

Tentative conclusion

Does this all matter? Yes, but I still have to work out the details. For now I just claim that the weakness of the mapping between biological and business systems means that any argument one might make about the goodness of “business ecosystems” in general and “keystone species” in particular is potentially misleading. It could delude firms into making unsound investments, e.g. in “building ecosystems,” and lead policy makers into dangerous judgments.

Notes

[1] The module ecosystem differs from the business ecosystem in that species, the nodes of the food web, are mapped to functional modules, rather than to individual companies. However, the glaring weaknesses of the business ecosystem metaphor undermine my confidence in the whole approach.

[2] John Harte, in “Business as a Living System: The Value of Industrial Ecology (A Roundtable Discussion),” California Management Review (Spring 2001), argues that the ecological sustainability practices pursued under the banner of “industrial ecology” are worthy and important, but that they do not mimic the way natural ecosystems work. His ideas are reflected in items #2, #5 and #6. He also notes that human processes can be much more energy-efficient than natural ones: photosynthesis is only about half a percent efficient, whereas power plants at 30% are sixty times more efficient. Note, however, that industrial ecology, defined as the proposition that industrial systems should be seen as closed-loop ecosystems where the wastes of one process become inputs for another (Wikipedia, ISIE), differs from the business ecosystem idea as I treat it here, i.e. that industry organization (regardless of ecological impact) can be understood as an ecosystem.

[3] Marco Iansiti and Roy Levien, “The New Operational Dynamics of Business Ecosystems: Implications for Policy, Operations and Technology Strategy,” Harvard Business School Working Paper 03-030 (2003)

[4] Marco Iansiti and Roy Levien, The Keystone Advantage: What the New Dynamics of Business Ecosystems Mean for Strategy, Innovation, and Sustainability, Harvard Business School Press, 2004

[5] Eric von Hippel, Democratizing Innovation (2005), and e.g. Charles Leadbeater

[6] David Messerschmitt and Clemens Szyperski, Software Ecosystem – Understanding an Indispensable Technology and Industry, MIT Press, 2003

A perspective on persistence

A wonderful fragment from the poem Seventeen Pebbles, from Jane Hirshfield's 2006 collection After: Poems (p. 61)
For days a fly travelled loudly
from window to window,
until at last it landed on one I could open.
It left without thanks or glancing back,
believing only - quite correctly - in its own persistence.

Thursday, June 14, 2007

The Narrative Fallacy, Data Compression, and Counting Characters

I’m very grateful to Tren Griffin and Pierre-Yves Saintoyant for independently suggesting that I read Nassim Nicholas Taleb’s “The Black Swan: The Impact of the Highly Improbable.” Both must’ve realized how relevant his thinking is to my exploration of hard intangibles. At times it felt as if the book was written with me in mind.

One of the human failings that Taleb warns against – “rails against” might be more accurate – is the narrative fallacy. He argues that our inclination to narrate derives from the constraints on information retrieval (Chapter 6, “The Narrative Fallacy”, p. 68-9). He notes three problems: information is costly to obtain, costly to store, and costly to manipulate and retrieve. He notes, “With so many brain cells – one hundred billion (and counting) – the attic is quite large, so the difficulties probably do not arise from storage-capacity limitations, but may just be indexing problems. . .” He then goes on to argue that narrative is a useful form of information compression.

I’m not sure what Taleb means by “indexing”, but I suspect that the compression is required for extracting meaning, not for storing the raw information. It’s true that stories provide a useful retrieval frame; since there’s only a limited number of them, we can perhaps first remember the base story, and then the variation. However, the long-term storage capacity of the brain seems to be essentially unbounded. What’s limited is our ability to manipulate variables in short-term memory; according to Halford et al., we can handle only about four concurrent items.

Joseph Campbell reportedly claimed that there were seven basic plots, an idea elaborated by Christopher Booker in The Seven Basic Plots; see here for a summary. The number is pretty arbitrary; Cecil Adams reports on a variety of plot counts, between one and sixty-nine. While the number of “basic plots” is arbitrary, the number of key relationships is probably more constant. I’m going to have to get hold of Booker’s book to check out this hypothesis, but in the meantime, a blog post by JL Lockett about 36 Basic Plots lists the main characters; there are typically three of them, or sometimes four. Now, there are many more characters in most plays and novels – but the number of them interacting at any given time is also around four.

I think one might even be able to separate out the data storage from the relationship storage limits: {stories} x {variations} allows one to remember many more narratives than simple {stories}, but I expect that the number of relationships in a given instance of {stories} x {variations} will be no greater than that in a given story.
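
A toy calculation makes the point (all numbers invented): factoring memory into base plots and variations multiplies what can be recalled, while the relationship load in any single instance stays at the working-memory limit.

```python
# Toy illustration: {stories} x {variations} versus the per-instance relationship
# load. All numbers are invented for illustration.
base_plots = 7                   # a Booker-style count of basic plots
variations_per_plot = 100        # hypothetical
concurrent_relationships = 4     # the Halford et al. working-memory limit cited above

print(base_plots * variations_per_plot, "rememberable narratives")        # 700
print(concurrent_relationships, "relationships juggled in any one of them")
```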

More generally: if making meaning is a function of juggling relationships (cf. semiotics), a limit on the number of concurrent relationships our brains can handle represents a limit on our ability to find meaning in the world.

Friday, June 08, 2007

Tweaking the Web Metaphor


Today’s modular Internet needs a metaphor make-over. The “silo” and “layer” frameworks that have guided policy thinking so far are no longer adequate. It’s time to reinvent a well-worn metaphor: the Web as a web. [1], [2]

The silo model divided up the communications business by end-user experiences like telephony, cable and broadcast television, assuming that each experience has its own infrastructure. The distinct experiences with unique public policy aspects remain, but the silos are growing together at the infrastructure level since all media are now moved around as TCP/IP packet flows. The layer model reflects this integration of different media all using the same protocols. It’s relevant when one takes an infrastructure perspective, but doesn’t take into account the very real differences between, say, real-time voice chat, blogs, and digital video feeds. One might say that the silo model works best “at the top” and a layer model “at the bottom”; in the middle, it’s a mess.

Time to revive the web metaphor, with a twist. The “web” of the World Wide Web refers to the network of pointers from one web page to another. [3] The nodes are pages, and the connections between them are hyperlinks. The “info-web” model I’m exploring here proposes a different mapping: the connections in the web represent information flow, not hyperlinks, and the nodes where they connect are not individual pages but rather functional categories, like blogs, social networking sites, search portals, and individuals.

Food webs

It’s a web as in an ecosystem food web, where the nodes are species and the links are flows of energy and nutrients. The simplest view is that of a food chain: in a Swedish lake, say, ospreys eat pike, which eat perch, which eat bleak, which eat freshwater shrimp, which eat phytoplankton, which get their energy from the sun via photosynthesis. A chain is a very simple model which shows only a linear path for energy and material transfer. (The Layers model resembles a food chain, where network components at one layer pass down communications traffic to the layer below for processing.)

A food web extends the food chain concept to a complex network of interactions. It takes into account aspects ignored in a chain, such as consumers which eat, and are eaten by, multiple species; parasites, and organisms that decompose others; and very big animals that eat very small ones (e.g. whales and plankton). The nodes in a food web are species, and the links between them represent one organism consuming another. While the nodes are multiply connected, there is some degree of hierarchy, since in an ecosystem there’s always a foundation species that harvests energy directly from non-organic sources, usually a plant using sunlight. Each successive organism is at a higher trophic level; first the phytoplankton, then the shrimp, then the bleak, etc.

Info-webs

In an eco-based web model for the Internet, the species in a bio-web are mapped to functionality modules as described in my earlier post, A modular net. For example, a YouTube video clip plugged into a MySpace page running on a Firefox browser on a Windows PC might correspond to the osprey, fish, shrimp and plankton in the simple example above. In the same way that there might be other predators beyond ospreys feeding on fish, there might be many other plug-ins on the MySpace page for IM, audio, etc. In a bio-web, a link between species A and B means “A eats B”. In the info-web model of the Internet, a link means “information flows from A to B.” Value is added to information in the nodes through processing (e.g. playing a video) or combination (e.g. a mash-up). For example, a movie recommender embedded in Facebook gets its information from a database hosted somewhere else, and integrates it into a user’s page. Information transport is therefore key. One can think of the links as being many-stranded if there are many alternative ways of getting the relevant information across, or single-stranded if there are only one or two communications options (e.g. for web search one can use Wi-Fi, 3G data, DSL, cable modem, etc., but for high-def video on demand there are many fewer choices).

The analogy between the info-web and the food web diverges when one considers what flows across the links. In the Internet, information flows around the web; in the biological case, it’s energy and nutrients. Information can be created at any stage in an information web and increases with each step, whereas energy is conserved, and available energy decreases as one moves up the trophic levels of an ecological pyramid. There is a sequence of “infotrophic levels” where information value is added at each step. Since the amount of information grows with each step, the “information pyramid” is inverted relative to the ecological one: it grows wider from the bottom to the top, rather than narrower.
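
To make the mapping concrete, here is a minimal sketch of an info-web as a directed graph, with invented node names and value-added numbers. Information accumulates as it flows downstream, which is the inverted-pyramid point above.

```python
# A minimal sketch of the info-web idea: nodes are functional modules, directed
# edges are information flows, and each node adds value to what it receives.
# Node names and numbers are invented; edges are listed in flow order.
from collections import defaultdict

edges = [                            # (source, destination): information flows A -> B
    ("video database", "recommender plug-in"),
    ("recommender plug-in", "social network page"),
    ("social network page", "browser"),
]
value_added = {"video database": 1.0, "recommender plug-in": 2.0,
               "social network page": 3.0, "browser": 0.5}

# Accumulate value along the flow: unlike energy in a food web, the total grows
# at every step instead of being dissipated as heat.
inflow = defaultdict(float)
for src, dst in edges:               # works because edges are listed in flow order
    inflow[dst] += inflow[src] + value_added[src]

for node in value_added:
    print(f"{node}: cumulative information value = {inflow[node] + value_added[node]:.1f}")
```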

Implications for policy making

The Internet is a complex web of interlocking services, and is approaching the richness of simple biological ecosystems. In the same way that humans can’t control ecosystems, regulators cannot understand, let alone supervise, all the detailed interactions of the Internet. One may be able to understand the interactions at a local level, e.g. how IP-based voice communications plug into web services, but the system is too big to wrap one’s head around all the dynamics at the same time. [4] This is why a market-based approach is advisable. Markets are the best available way to optimize social systems by distributing decision making among many participants. Markets aren’t perfect, of course, and there are social imperatives like public safety and justice that need government intervention. The info-web model suggests ways to find leverage points where regulators should focus their attention, and also provides salutary lessons about the limits of the effectiveness of human ecosystem management.

For example, a keystone species is one that has a disproportionate effect on its environment relative to its abundance. Black-tailed prairie dogs are a keystone species of the prairie ecosystem; more than 200 other wildlife species have been observed on or near prairie dog colonies. Such an organism plays a role in its ecosystem that is analogous to the role of a keystone in an arch. An ecosystem may experience a dramatic shift if a keystone species is removed, even though that species was a small part of the ecosystem by measures of biomass or productivity. Regulators could apply leverage on “keystone species” rather than searching for bottlenecks or abuse of market power. This would provide a basis for both supportive and punitive action. At the moment search engines are “keystone species” – they play a vital role not only in connecting consumers with information, but also in generating revenue that feeds many business models. One might say that Google is the phytoplankton of the Internet Ocean, converting the light of user attention into the energy of money. Local Internet Service Providers may also be keystone species. In an earlier phase of the net, portals were keystone species. Keystone services provide a point of leverage for regulators; they can wield disproportionate influence by controlling the behavior of these services.

The unintended side effects of intervention in ecosystems stand as a warning to regulators to tread carefully. For example, the Christian Science Monitor reported recently on efforts to eradicate buffelgrass from the Sonoran Desert. It was introduced by government officials after the Dust Bowl in an attempt to hold the soil and provide feed for cattle. It has unfortunately turned out to be an invasive weed that threatens the desert ecology, choking out native plants like the iconic saguaro cactus. Another example of biological control gone wrong is the introduction of the cane toad into Australia in 1935 to control two insect pests of sugar cane: it did not control the insects, and the cane toad itself became an invasive species. By contrast, the release of myxomatosis in 1950 was successful in controlling feral rabbits in that country.


----- Notes -----


[1] Steven Johnson’s Discover essay, republished as “Why the web is like a rain forest” in The Best of Technology Writing, ed. Brendan Koerner, helped inspire this thinking.

[2] This is a rough first draft of ideas. There are still many gaps and ambiguities. The nature of the nodes is still vague: are they applications/services (LinkedIn is one node, Facebook is another), application categories (all kinds of social networking sites are one node), market segments, or something else? How and where does the end user fit in? How can one use this model to address questions of VOIP regulation, accessibility directives, culture quotas for video, and other hot topics in Internet policy? Much work remains to be done. The representation of transport services as links rather than nodes may change. The difference in conservation laws needs to be worked out: sunlight, water and nutrients are limited, conserved, rival resources in the bio-web, whereas information is non-rival and can be produced anywhere. Connections need to be made with prior work on metaphors for communications technologies, e.g. Susan Crawford’s Internet Think, Danny Hillis’s Knowledge Web, and Douglas Kellner’s “Metaphors and New Technologies: A Critical Analysis.”

[3] The word web derives from the Old Norse vefr, which is akin to weave. It thus refers to a fabric, or cloth. In many usages, e.g. food webs, there are assumed to be knots or nodes at the intersection of warp and weft, which occur in nets, but not in fabrics.

[4] This is a link to the Hard Intangibles problem more generally, via the limit (about four) on the number of independent variables that humans can process simultaneously.