During Thursday’s Forrester-sponsored Twitter conference (“Tweet Jam”) on “What Business Intelligence Is and Is Not” (search Twitter for #dmjam), I remarked that BI in general faces the problem that “people ask the wrong questions,” and that this will get worse with the impact of social media, applications, platforms, and patterns. I was challenged to provide some examples, but:

@jrep . @GregBonnette: @jrep Could you give example of “the wrong questions” #dmjam @lorita << have several, but in 140 characters? …

What I mean by “the wrong questions” is a problem that afflicts BI already, a peculiarly active form of the more general problem known as “confirmation bias,” that becomes more acute in the presence of “social media” dynamics.

You always come to BI with some sense of understanding how things work, some guesses at what you’re going to find. Often, your BI serves primarily to confirm your suspicions—and that’s a good thing, because it empowers you to move more forthrightly in the right direction. A problem arises, however, when your guesses and suspicions are actually wrong: because you frame the questions and interpret the answers, there’s still a strong tendency for the BI to confirm. Confirmation of your wrong guesses and misplaced suspicions is still going to empower you to move forthrightly, but in these cases, you’ll be moving in the wrong direction! Analysts generally understand this; decision-makers sometimes don’t. A critical part of the business analyst’s job is examining additional analyses that have the potential to disprove current guesses and suspicions. If they hold up, if you fail to disprove, great: you’re on the right track; you might not even show those results to the decision-maker. If you find holes, though, then even better: you’re saved from walking into a pit—or you will be, as soon as you show and explain them.

All that applies already, in “conventional BI.” But “social media” introduce a new problem: cooperative structures (like the OAuth system for delegated authorization) mean that some of the information you’d like is primarily in someone else’s hands. It may be that the protocol and partnerships allow you access to this extra information, but can you afford the performance and programming costs to collect it? If, as an analyst, you’ve routinely been looking at more data than you show the decision-makers (and you really should!), then you have a hard job ahead explaining why you suddenly need to spend so much more to collect information they’ve never seen! There’s a risk that you’ll only have access to the core information, the information that can only confirm guesses and suspicions that were already held at the time the system and partnerships were designed: you might lose the richness of data necessary for those counter-guess tests.

Some examples, since that was the challenge:

Imagine a company that offers a free download of a limited version of their product. For a variety of reasons, you institute a registration system for the downloads. It’s a common enough arrangement: it increases potential customer exposure to the product, it identifies better-qualified leads, and it provides great, instant feedback on the effectiveness of new releases and ad campaigns. You’re going to mine those registrations and downloads for all the BI you can squeeze out. Lots of obvious, guess-confirming questions, there. But here’s a guess-denying question that should be asked (hoping, of course, that the answer is “no”): do people have some other, non-registered way to download the same files? Because if they do, then a lot of them are going to find it. You need the info on registered downloads, but you also need the comparative info on downloads that bypass the registration. Someone who thinks they know the system is likely to assume—without thought—that there aren’t any unregistered downloads; someone else needs to wonder and check.
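To make that check concrete, here is a minimal sketch in Python of the comparison between registered downloads and downloads that bypass registration. Everything in it is an assumption for illustration: the log format (pairs of file name and registration token), the token values, the file name, and the helper names `split_downloads` and `bypass_share` are all hypothetical, and a real system would read its own web server and registration records instead.

```python
from collections import Counter


def split_downloads(download_log, registered_tokens):
    """Count downloads per file, separating registered from unregistered.

    download_log: iterable of (file_name, token) pairs. The token is None
    when the request carried no registration token (hypothetical format).
    registered_tokens: set of tokens issued by the registration system.
    """
    registered = Counter()
    unregistered = Counter()
    for file_name, token in download_log:
        if token is not None and token in registered_tokens:
            registered[file_name] += 1
        else:
            # No token, or a token the registration system never issued.
            unregistered[file_name] += 1
    return registered, unregistered


def bypass_share(registered, unregistered):
    """Fraction of all downloads that did not go through registration."""
    bypassed = sum(unregistered.values())
    total = sum(registered.values()) + bypassed
    return bypassed / total if total else 0.0


# Made-up sample data, for illustration only.
log = [
    ("trial-1.0.zip", "tok-a"),
    ("trial-1.0.zip", None),
    ("trial-1.0.zip", "tok-b"),
    ("trial-1.0.zip", "tok-unknown"),
]
registered, unregistered = split_downloads(log, {"tok-a", "tok-b"})
print(f"Share of downloads bypassing registration: "
      f"{bypass_share(registered, unregistered):.0%}")
```

The point of the sketch is the second counter: a report built only from the registration table can never show a bypass, so the raw download log has to be collected and compared even when everyone expects the share to be zero.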

  1. jrep

    Yes, the tensions of whether to enforce registration or not can be very real. Research is beginning to come in soundly refuting the … urban myth … that people don’t care about their privacy: quite the contrary, people seem to care *more*, and are becoming more and more resistant to overt privacy invasions (even if less and less able to think through the increasingly complex covert ones). If you depend on those registrations (whether for nuclear security, or just a few more sales), this reaction against registration is troubling.

  2. Nice post — have another scenario for you — a few months ago we were faced with a difficult question even for very experienced analysts and decision makers.

    I wrote a hypothetical use case for international anti-terrorism involving a hydro dam and nuclear facilities. We initially set it up as registration only, but in tracking web site visits for many days, very few were even willing to provide this minor amount of information, and disturbingly despite a summary — those who refused included multiple nuclear power plant domains, multiple IPs from DHS, DoD, State Department… while there wasn’t anything terribly revealing, we also did not necessarily want to plant a seed that didn’t already exist in the bad guys….

    While the individuals may have had their reasons, and we reached each of these organizations in other ways, the trend since the commercialization of the web is in many ways disturbing and self-destructive. The herd moves in mysterious ways!

    Bottom line is that I caved and made it available publicly. We determined, for better or worse, that we should err on the side of sharing, even though the probability is very high in this case that any value taken from the case will wind up in the hands of a competitor, no doubt a favored government contractor full of ex-intel folks. That was prior to the underwear bomber incident; a few months later a report came out that very similar functionality was adopted between the U.S. and EU… so the pattern goes, and goes, and goes…

    Thanks for the post.

