
Fortune Telling with Data: Modeling Threat with Feeble Predictors

While in college, and unbeknownst to most people, I dabbled in some performance artistry and predictive analysis. Part of the performative nature of college is developing the capacity to claim competence in topics and credentials one has yet to earn, to educate formally while informally fumbling through social mechanics for which no adequate prerequisite is ever published, and finally to make elaborate promises to potential employers you can’t yet keep. Corroborating this farce with some documentation is usually expected: a cover letter here, a résumé there.

On professional applications, it seemed impressive to have a range of acronym memberships to organizations with undefined but assumed-legitimate titles. Interviewers, however, seemed inattentive to merit or documentation fluff, so in small print among some legit scholarships and volunteer positions, I wedged in a nod to my extra-curricular involvement in the “ESP Volunteer Aid Org.”


I am not a performance artist or a seer, and this was an experiment I dropped shortly after, but I think there’s some irony in that, while attempting to assemble my prospective professional qualifications, I spent some time considering a career in heightened sensory perception. Or rather, I tried to jest-test the merit system with an obviously bogus acronym. Funny in retrospect, but previous experience in ESP isn’t far from the tacit prerequisite many would assign to data mungers these days, and particularly anyone who does even mild statistical analysis on crisis datasets. With all these data, surely someone would be able to claim clairvoyance and solve international crises with an affinity for computational analysis of historic precedent. Surely the answers are there?


As someone who works with data, and more importantly as someone currently living in the future (compared to those currently living in my home country…whoohoo Nairobi time), I thought it might be appropriate to exercise my clairvoyance and provide clarity on life in Nairobi, assessing some of the current intensities and the probability that they might get worse. In any case, the explosion of State Department travel warnings in my inbox this week has made my reticence on the subject a bit obnoxious, so I’m going to diverge from my typical open source software soap-boxing to write a bit about the statistics of terrorism and the particulars of my current condition.

I write from the position of a math hobbyist and an amateur clairvoyant, so I’ve peppered this post with some specious but thoughtful observations about the news I’ve been following, what I’m currently experiencing, and the links, images, and resources my limited bandwidth allows me to explore.

In brief, Nairobi has been intense of late. The State Department issued 4 official warnings this week, encouraging US residents and visitors to avoid Eastleigh, travel to Mombasa, proximity to Burundi, and most recently, travel in Kenya, period. These precipitate from the bombing earlier this week in Nairobi (6 fatalities and 20+ injuries), the recent church attack in Mombasa (4 fatalities), and general insecurity about al-Shabaab and armed operatives threatening attacks in any country conducting peacekeeping and/or military efforts in Somalia. :(

Speculation about the probability of a “large scale attack to come” made me start thinking about the meaning of “large scale” and projected “imminence” when it comes to statistically predicting events of high variability. How large is a “large scale” event? Anything where multiple deaths result seems “large” to me, though my definition has adjusted to accommodate recent conditions. If authorities are projecting a large-scale event to come, what about the unsettling events of now? How soon is imminence? Not to be too much of a Morrissey fan-girl, but how soon is now?

So with all of these questions and my own preoccupation with quantifying self, I thought it might be time to read up on a few predictive models of the likelihood that something might happen.

Periodically people post comparatives online like “you’re __ times more likely to die of x than get involved in a terrorist attack.” These were popularized online post-9/11, it seems, though the scale and impact of that attack in the domestic US was fairly singular, and data collected prior to it would do little to predict its occurrence or recurrence without admission of several limiting factors and uncontrolled variables. That said, in the Annals of Applied Statistics (vol. 7, no. 4, 2013) last year, Clauset and Woodard published a paper called “Estimating the Historical and Future Probabilities of Large Terrorist Events,” in which they set out to define a generic statistical algorithm for estimating the likelihood of terror events in complex social systems. These kinds of predictive stats depend on many variables outside the precedent empirical data, but the authors present multiple tail models and disclaimers about the limitations of their predictions to account for this (Matlab code and sample data available here if you want to play).


Of particular interest is their summary forecast, which estimates probabilities for three potential scenarios projected from past data:

“Rather than make potentially overly specific predictions, we instead consider three rough scenarios (the future’s trajectory will presumably lay somewhere between): (i) an optimistic scenario, in which the average number of terrorist attacks worldwide per year returns to its 1998–2002 level, at about ⟨n_year⟩ = 400 annual events; (ii) a status quo scenario, where it remains at the 2007 level, at about 2000 annual events; and finally (iii) a pessimistic scenario, in which it increases to about 10,000 annual events.”

That then, looks something like this:

[Image: table of forecast probabilities for the three scenarios]
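
The arithmetic behind a forecast like this can be sketched in a few lines. This is not the authors’ Matlab code. It assumes, purely for illustration, that every attack independently carries the same small chance p of being catastrophic, so the chance of at least one catastrophe among n attacks is 1 - (1 - p)^n. The function names are mine, and p is backed out of the 0.461 status quo figure used in this post rather than taken from the paper.

```python
def prob_at_least_one(p_event: float, n_events: int) -> float:
    """Chance that at least one of n independent events is catastrophic."""
    return 1.0 - (1.0 - p_event) ** n_events


def per_event_prob(p_total: float, n_events: int) -> float:
    """Invert the formula above: per-event chance implied by an overall chance."""
    return 1.0 - (1.0 - p_total) ** (1.0 / n_events)


YEARS = 10
# Annual event counts from the three scenarios in the quoted passage.
SCENARIOS = {"optimistic": 400, "status quo": 2000, "pessimistic": 10000}

# Assumption: derive p from the status quo figure (0.461 over one decade).
p = per_event_prob(0.461, SCENARIOS["status quo"] * YEARS)

for name, n_year in SCENARIOS.items():
    print(f"{name:12s} {prob_at_least_one(p, n_year * YEARS):.3f}")
```

Under that simplification the three scenarios come out at roughly 0.12, 0.46 and 0.95, which shows how strongly the answer depends on the assumed number of events per year.
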
The Clauset and Woodard analysis further predicts a range forecast of 19-46% chance that at least one catastrophic global event will take place in the next decade. But to localize this a bit, and for the sake of argument, let’s take their “status quo” (a modest median) probability calculation trained from the RAND-MIPT Terrorism Knowledge Base, and set up the conditional probability that there will be a terrorist happening of catastrophic proportion (p=0.461) while I am in Nairobi (approx. 30 of the 365*10 possible days in a decade; p=0.008). The condition is fairly unlikely, but unfortunately the likelihood increases when you factor in covariates like my general foreignness (> victim likelihood…bummer, p=0.475), and the logic that the violence occurring with agglutinative regularity will likely foster additional conflict and escalated tension:
“For instance, international terrorist events, in which the attacker and target are from different countries, comprise 12% of the RAND-MIPT database and exhibit a much heavier-tailed distribution, with α̂ = 1.93 ± 0.04 and x̂_min = 1.”
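
To get a feel for what an exponent like that implies, here is a minimal sketch of a power-law tail. It uses the continuous approximation P(X ≥ x) = (x / x_min)^(-(α - 1)), which is my simplification and not necessarily the exact model the authors fit. The function name and the comparison exponent of 2.5 are hypothetical, chosen only to show how a smaller α fattens the tail.

```python
def tail_prob(x: float, alpha: float, x_min: float = 1.0) -> float:
    """P(X >= x) for a continuous power law with exponent alpha, for x >= x_min."""
    if alpha <= 1.0:
        raise ValueError("alpha must exceed 1 for a normalizable power law")
    if x < x_min:
        raise ValueError("x must be at least x_min")
    return (x / x_min) ** (-(alpha - 1.0))


ALPHA_INTERNATIONAL = 1.93  # fitted exponent from the quoted passage
ALPHA_HYPOTHETICAL = 2.5    # made-up lighter tail, for comparison only

for severity in (10, 100, 1000):
    heavy = tail_prob(severity, ALPHA_INTERNATIONAL)
    light = tail_prob(severity, ALPHA_HYPOTHETICAL)
    print(f"P(X >= {severity:4d}): alpha=1.93 -> {heavy:.5f}, alpha=2.5 -> {light:.5f}")
```

The gap between the two columns widens as severity grows, which is the practical meaning of “heavier-tailed”: extreme events stay comparatively likely.
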
Trying to control for multiple variables is complicated, so even a problem that can be structured as conditional (probability of x given state n) struggles in this scenario. Does the probability of one state affect the other and yet still require factoring in both? And if so, perhaps it’s a joint probability issue between independent events. When the prediction derives from historical information, perhaps a Bayesian use of prior probabilities could be trained for future forecasting, but even then…complicated. And regardless, perhaps the historical data is limited in applicability due to scope; my definition of “catastrophic” scales down to the mere injury of a family member/friend, decidedly distant from the catastrophic proportion of 9/11 or any event with upwards of 1000 fatalities used to make these kinds of probabilistic predictions.
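
For what it’s worth, the back-of-envelope joint probability from the Nairobi example can be sketched as follows. It leans on the strong and probably wrong assumption that the timing of a catastrophic event is independent of my travel dates and spread evenly over the decade, so the two numbers simply multiply. The variable names are mine; the 0.461 and the 30-day stay are the figures used in this post.

```python
P_CATASTROPHE_DECADE = 0.461   # status quo forecast for one decade
DAYS_IN_NAIROBI = 30           # approximate length of stay
DAYS_IN_DECADE = 365 * 10

# Share of the decade spent in Nairobi (the post's p = 0.008).
p_present = DAYS_IN_NAIROBI / DAYS_IN_DECADE

# Assumption: an event, if it happens, is equally likely to fall on any day,
# independent of my travel dates, so the two probabilities multiply.
p_joint = P_CATASTROPHE_DECADE * p_present

print(f"share of decade in Nairobi: {p_present:.4f}")
print(f"joint probability:          {p_joint:.4f}")
```

Note that this is the chance of a catastrophic event happening anywhere in the world during the stay, not in Nairobi specifically, since the underlying forecast carries no location information.
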
Most of the math here is beyond my own research level, but one factor that strikes me as strange, given my work with crises in the context of maps, is the absence of particularly specific geo-data analysis. The East African Community (EAC) hasn’t been spared much violence in the past few years, and sadly, in the last few months in Kenya, so I’m interested in reading about statistical modeling done on conflict probabilities with geo-specificity. Maybe this is a use case for the Wolfram Language when I’m brought out of my beta-in-waiting status; something to counter the pop-y around-the-world travel time estimates and polar auto-opposite calculations that have been so fun but maybe not particularly applicable to my current situation. What might be applicable is a computational knowledge engine that would assess my IP address, map it to a lat/long, and then calculate how far I should move in the city to avoid conflict on a daily basis (*winks* at Wolfram friends).
To be fair, the Clauset and Woodard research honestly nods to the variables not completely considered in their analysis:
“Technology, population, culture and geopolitics are believed to exhibit nonstationary dynamics and these likely play some role in event severities…our approach is nonspatial and says little about where the event might occur…refinements will likely require strong assumptions about many context-specific factors” (Clauset + Woodard, 15).
But I’m still wondering about alternatives to these estimations: what is the best research body to design these kinds of models, and who has the best open test data on the topic? I’m generally skeptical of predictions based on historic data without geo-reference these days, since so much of what happens depends on a cultural/historical/social context that is impossible to divorce from a particular place; the general forecast of 19-46% chance of something happening in the next decade at a global scale is hard to conceptualize when you consider the umpteen geopolitical factors that might cluster likelihood around certain high-tension locales (Clauset + Woodard, 14). Perhaps there will one day be a service to prioritize these factors and covariates based on personalized social and surveilled data, as Seth’s Worry App concept suggests:
“Worry is the very first technological solution that maximizes the benefit of mankind’s oldest task: anxiety.
Using this flow of data, the Worry app computes the things you ought to be worried about. For example, instead of needlessly wasting time worrying about a random event like being bitten by a brown recluse spider, the Worry GPS system can point out that based on where you are, you’d be better off worrying about a different, unpreventable event like being killed by a fire hydrant flying through the air or perhaps by an angry rooster wielding a knife. The Worry app will alert you to that, which dramatically increases the effectiveness of your worrying.”
I’m into anxiety optimization and maximized thought efficiencies, perhaps a maturation of my adolescent ESP :).
In all seriousness, there is probably little statistical value in projecting these possibilities where I am currently. Fortune telling with so many variables can be complex, though the projections remain pretty unsettling.
But apart from all the speculative quant, there are some simple qualitative observations that I can make:
  • things are heating up every day here
  • any situation where “safety in numbers” is a paradox, because avoiding congregation points (malls, churches, etc.) has become a way of avoiding conflict, is probs bad
  • at the end of the day, statistical randomness is a really unfortunate jerk who even despite your best precautions can allow for some pretty horrific happenings (a carjacking happened in my inner circle this week, for example)
  • there’s something broken about the fact that the entire reported anti-terror budget of Nairobi is less than my current apartment’s rent outside the city (both cases supposedly sustaining a month’s worth of expenses). If we drill down on that statement semantically, and not quite statistically, we can conclude that the collective safety of a city in a time of “imminent” crisis is roughly worth a one bedroom apartment.

Mathematically calculated? Feebly. Statistically significant? Probably. Totally unfortunate? Predictably.
