Tag Archives: data journalism

There Must Be A Pony Somewhere: Digging in Data to Find a Story

Quote Investigator wrote a cute quip about the origins of this blog’s title quote (“…there must be a pony somewhere…”), and lately, it has me thinking about a job I share with many techy-journalists: digging through data (evidence) for a story (pony). I’ve commented on that a bit exhaustively in this blog, but the metaphor carries through to building a data journalism team, composed of a ragtag herd of unicorns, racehorses, and predominantly, ponies. Online Journalism Blog did a short piece about the taxonomy of journo-developers too, bulleting a few typical types (racehorses, unicorns, mules), to which I’d like to add ponies before diving a little deeper into what this means in terms of characterizing a professional population by its equine analog.

At this week’s MIT Civic Media Conference, Joi Ito kicked off an introductory talk with a nod to his coder fellow, a “unicorn” journalism-coder-analyst who had just joined the team, so the metaphor has stuck with some steady citation and I think it’s worth discussing here. In the next few sections, I’ll cover a few adventures in geo-journalism, talks and projects I’ve done around mapping in the past months. Moreover, this will be a blog about our equine habits and heroes in data journalism, and some musings on what media hackery earns in terms of recognition and reward.

Dev-Journo Taxonomies

There’s an understandable spectrum of personality types and professional competencies in Data Journalism. There are the fantastic anomalies: unicorns; the hardy worker hybrids: mules; the strange and rare portmanteaux whose skills define along a folksonomic schema: looking at you zorse, zebroids, donkras. I gave a talk on Data Journalism a few months ago (check vimeo below), and the thesis of my presentation echoed the essentially hybrid aspects of the job.

Those born under the sign of the Horse are a flexible group of people. They tend to be stubborn when it comes to their ideas, but they are also incredibly patient when it comes to hearing out what other people have to say. They favor straight-forward conversation, but avoid trouble where possible; a paradoxical combo, but one that makes the horse persistently fascinating as a sub-population in the animal kingdom.

Data Journalism in DR

So in the data space, why fixate on ponies as representative of some substantial sample population in the greater software engineering venn? Because ponies are slightly different than horses: capable of the same intelligence and empathy but perpetually twee-er and often assumed to be less mature. Some of the brilliance I’ve witnessed from millennials in the data journalism space has made me think that another branch of the taxonomic tree should recognize those whose aptitude in code is impressive but whose journalism background, and experience in general, perhaps seems premature.

Pony Projects

When social media steps down from the free speech party, and while governments and institutions of modern social exchange continue to use networks as a way of monitoring and managing society, it’s often the critics and the activists who have to pick up the slack to produce objective publications, and in this space the post-modern (and often, outsider/premature) workhorses of the data journalism space have something to contribute.

As a class, proto-journalists and data mungers have developed some tools to analyze trends and provide objective and uncensored criticism of the information they represent. Zeynep Tufekci’s talk at this year’s MIT Civic Conference on citizen investigative journalism in Turkey gave a nod to the use of social media (and Twitter feeds in particular) as infrastructure for collecting public opinion and fact-checking specious claims. Many tools for crowdsourcing, Ushahidi included, can be deployed to provide citizen journo-ponies, smaller breeds of self-taught but domain-proficient reporters, with tools for reporting. And while much of this citizen-driven practice is perhaps under-promoted in the contemporary news space, some of the most renegade journalism efforts are sustained by citizens running depolarization operations on social media platforms in their home countries, as Zeynep’s talk suggested.

Pony Hierarchies

 

Part of the persistent argument in discussions that blend net neutrality, privacy and surveillance censorship revolves around how important crowdsourced and social content has become for developing honest and unbiased alternative reporting models globally. Though not to be confused with incident data directly, social media reports like CrisisNET’s Syrian YouTube Map and Conflict Map’s tweet and social media tracking plan provide this kind of window into the world of social streaming to study crises. In analysing, contributing, and dissecting social media content, pony-journalism has become a more dominant approach to assessing conflict and geo-journalism at a global scale.

Muybridge Motion Studies

In fact, arguments around how to classify the oft-hyphenated and obscure titles applied to data-journalists are more about the hybridity of their job descriptions and the range of skills they deploy than about the elegance of the metaphor. As an equine-hybrid class, we’re often trying to find new ways of developing and pushing content, a nod to the aggressiveness and tirelessness of the horse behavioral type. But part of that race, maybe the most important part, is about designing content and news to appeal to people, to visualize data in new and yet intuitive ways. Our objective is to find ways to relate to populations, and in a sea of bar charts and statistical models, sometimes maps are the more effective way of relating complex digital data to a simple physical topography. That’s where the map making (mentioned above) comes in.

Two of the most relatable and persistently referenced data types in post-modern visualization are geo-data and time-series. Why? Because we relate to them, we can consider our perspective relative to time and space; they have become our touchstones for syncing digital and physical worlds. Overwhelmingly, the projects at this year’s Civic Media Conference demo sessions fell into some kind of mapping context, and I think that trend is telling for the direction of visualization schema and citizen journalism: What We Watch, a map of YouTube trends; Terra Incognita, a Chrome extension for mapping exploration; Media Cloud, a collection of tools for monitoring and mapping media globally; or Cliff, a project to automate media geo-parsing, being a few among many featured projects. Tools like We Feel and CrisisNET are aimed at facilitating this kind of study, enabling study of social media and reporting strategies. In each case, it will be interesting to watch how they compete in the investigative reporting space; the race seems primed to recognize their utility.

Pony Prizes

To address another interesting aspect of the data-journo ecosystem, I’ll now pivot to another curious theme in the MIT Civic Conference and others like it: the concept of work “family.” In keeping with the metaphor of this post, I would argue that family, in the case of a company or sponsor, is more analogous to genus hierarchies than to social kinship models. People who share a company share a type and a goal; they’re a team, but one built on affinity, not consanguinity.

This is a family:

IRL family reference

This is a team:

Company-Family

A company/funder/sponsor/laboratory/media-outlet/workplace is a herd of ponies. As individual members, we are unique in our methods and backgrounds and generally attracted to the same trajectory, but probably more powerful in that dispassionate diversity which a team or herd-mentality affords, less complicated by emotional entanglements internally and therefore more competent at empathy externally (that is, with our users/subjects/sources for stories). In a recent HBR article, “Your Company is Not Your Family,” the author uses the analogy of sports teams, and the mention of the Spurs made me think that the pony metaphor might be as ridiculously apt.

The Spurs stand out for the stability and longevity of their player relationships, yet even their current 13-man roster only includes one player from their first championship in 1999: power forward Tim Duncan.

To consider your company analogous to your family is to cripple it by a lack of adventure. Families, while wonderful, are a default; they usher you to growth, but if all goes well, you flourish on your own. You want to build a company of people who are flourishing, and will continue to do so under guidance and not parentage.

Joi Ito concluded the MIT Civic Conf with a series of “guiding principles” at the media lab, and those statements reinforce all-the-more why a lab/company isn’t a family. A team can be built on shared principles, but they’re not the same as those on which a family is founded.

Your family pushes you, educates you, and prefers (often) your safety over risk taking, whereas your work, and your class (genus/type/subgroups) often push you to independent and outlier achievements unsanctioned by precedent and rarely “safe” in practice. A total aside in this blogpost, to be sure, but I think often data journalism professionals (and by extension, other political/social-professionals who put position before the public they serve) seem to allow confused allegiance to cleave them from simple human and social empathies.

This is a point I treated in a recent interview with Danish news about the relationship between developers and journalists. Nothing revolutionary, but at the time I compared the ideal scenario to one of mutual respect in difference, and not to a familial metaphor. My collaborators aren’t my siblings, they’re my colleagues, and the relationship is pretty different in my mind.

We sometimes risk an allegiance to an editor or organization over an allegiance to the public, and it’s important to remember that the protection and privacy of your subjects and sources is just as precious as that of your employer-parents, regardless of who is paying our salaries. Too often, I’ve seen people at conferences too proprietarily motivated to share ideas, too proud to admit that many share the same ones and have started similar projects. There was a lot of overlap at this year’s Knight News Challenge award announcement, and I think it’s fair to ask overlapping orgs to collaborate and share their plans and programs of research as the year progresses, though I doubt they’ll be held to this. Sometimes, considering your company like your family can confuse your objective to do good in the world and supplant it with one to do good for your own.

This brings up another aspect of social good work and journalism worth mentioning here. Often, the competition in the data journalism space is built on a capitalistic motivation to secure funding and support and resist the superior publication of another outfit that prematurely scoops your content. In this fear, we privilege our company over our vocation, which is to spread solid news, to share it with the world. There’s no shortage of conflict and controversy worth commenting on, so the competition seems sad and contrived, especially in the social good and open source space. But recently, I’ve been reading economic coverage of the pay-gap issue and have come to appreciate that this competition has deep roots, founded in our cultural resistance to recognizing social-good as grant-worthy.

unicorn shower

some related items found on the Pinterest “unicorn” keyword search

 

The most prize-worthy ponies deserve reward, and I think it’s interesting to consider how we approach compensation when the goal of your work is social good. The resounding answer seems to be: we don’t.

Econ-Theorist David Graeber’s recent interview on the trends in our financial sector indicates that we rarely value work performed with altruistic motives, and that we waste most of our workforce on “bullshit jobs.” While our intentions might be genuine, study of our current workforce specialization schema indicates that we dole out few directly productive (as in “product-building”) positions, and most work is “administrative” or “managerial”: “…[l]ots of people [in Graeber’s interview pool] said their basic function was to create tasks for other people.” One quote that struck me as particularly insightful:

Geoff Shullenberger recently pointed out that in many companies, there’s now an assumption that if there’s work that anyone might want to do for any reason other than the money, any work that is seen as having intrinsic merit in itself, they assume they shouldn’t have to pay for it… ~David Graeber

You can read more about his provocative and well-argued perspective here, and while he applies his study to translation jobs, I think the scope can widen to anyone doing fulfilling, socially conscious, and context-driven journalism, globally; we’re all in the information translation/transformation/communication business at root.

You know, you’re describing what’s happened to journalism. Because people want to do it, it now pays very little. Same with college teaching. ~ Thomas Frank

Upshot: not compensating people doing good, critical, and socially beneficial things in the world is crippling our perspective on geopolitics and progress.

Problems with Ponies Abroad

Other than economic obstacles to pursuing social good, there are other hiccups in the hierarchies of investigative journalism that relate to how we privilege unicorns over the content they cover, and here we return to our discussion of mapping. When I was at a hackathon last month in Aarhus, Denmark, my team won the Guardian API award at the event not for building something incredibly revolutionary, but something quick that simplified news content into a digest for mobile journos.

ecard-horse

Our app was called GeoNewsies, and its objective was to allow travelers to search by country and pull down a digest of the news in that nation prior to, or during, travel. A two-paneled webpage and Android app, it pulled in the top 10 articles from the Guardian relative to a particular place (panel left), next to the top trending tweet topics in that place (panel right); a bit like thenews.im or other RSS aggregate sites.

geonewsies-web

The interface was unstellar, simple, and arguably flattened the geo-political happenings in a place to a top 10 trends list, but our objective illustrated something tragic and important about how we process news media today, and maybe it’s not what you would expect. Our point wasn’t that people can only afford to read short blurbs and dramatic reductions of the richness available in pre-travel research, but more so that travelers often fail to self-educate about the context they are about to enter, and this unfortunately extends even to traveling journalists working investigative beats abroad.

ecard-unicorn

Sometimes, the best witness to activity in a particular place is someone on the ground and local; this is why so much social media analysis and source relations with citizen journalists remain important to our global understanding of news. Displacing a data journo-“unicorn” to code in a foreign environment is rarely as productive as sourcing information and accounts from the local population, and then enlisting the unicorns or racehorses to usher an idea to production; or better, training the local ponies and mules to race.

http://www.buzzfeed.com/donnad/stop-what-you-are-doing-and-look-at-ponies-in-swea?utm_campaign=socialflow&utm_source=twitter&utm_medium=buzzfeed

Scotland Tourism’s Sweater-Pony Campaign

Burak Arikan’s MonoVacation tourism visualizations speak to this touristic approach to documentation of place that has become our practice in journalism. Arikan built a projected mashup of the tourism videos/commercials of many nations, exploring typical symbols and their geo-contextual meanings relative to the nation of video production. Horses were a trend, repeatedly used in travel commercials to express freedom and tourist whimsy, perhaps. Abstracted a bit further from the original project focus, you might consider the horse comparison to data journalism as a sometimes apt description of investigative practice: short sprint production and reporting with often unfortunately abbreviated context, a tourist’s view of geo-politics. Often a foreign media outlet’s assessment of the on-the-ground occurrence in one place lacks the depth of historical and hyperlocal understanding that social media reporting/analysis can provide if controlled, curated, and harnessed to meaningful ends. Our attention span for international news is something that perhaps can’t be corrected, but our approach to economizing a broader range of opinion and local perspective is something that might be best achieved with social analysis and local data journalism training.

As someone who came rather late to code, I’m pretty comfortable advocating the premise that code can be trained, and not limited to the hierarchies of mythical creatures. I’d argue that researching for a story involves a healthy amount of logic that is more intuition and contextual/location knowledge than technical skill. Compelling news applications about a particular time and space are ones that root in a thorough knowledge of the geo-politics of a place, and often those come through most clearly from content generated by local mules, rather than unicorns.

Post-HorseRace: Project Persistence

It’s safe to say, however, that team assembly and the logic of our production pipeline aren’t the only concerns in developing sustainable news applications. With news apps, we deal in a particularly friable medium, one whose impact often lasts only as long as its API/library/dependency components have yet to deprecate. When we think about endurance and the persistence of applications, we sometimes think about the ephemerality of our work.

What happens when the horserace is over; how will we remember our efforts?

This worry is not new of course, and it’s one that’s been persistently suffered by media producers and providers globally. Born-digital projects are so vulnerable to almost immediate atrophy, and while you may make history with a web-based piece, the probability of it outlasting even newsprint articles from 30 years ago is pretty pathetically weak.

We’re tackling that next month (July 23rd) at the 2014 Digital Preservation Conference in DC, so check it out if you’re interested. Our objective in presenting is both to survey the state of media production today and discuss preservation options, but also to acknowledge some technological trends we should avoid. Contemporary product development is replete with light-bulb conspiracies of ‘planned obsolescence’ and at the opposite spectral pole, stories of technology built for eternity. Somewhere in the middle, there’s a place for news apps in our geo-political history; a few pony programmers might just figure out how. 🙂

Finish-Line

To sum up this (rather-too-longform) piece about pony personalities in the geo-newsroom, I’d say that a lot of our professional expectations as journalists and developers presume a few narrow ideas: firstly, that a simple taxonomy can define competence in global news coverage; secondly, that companies can operate like parents; and thirdly, that the integrity and sustainability of your work are secondary considerations to the general scheme and scope of a path defined by paternity.

I’ll close with a link to my MIT Civic Media Ignite slides (presentation, references); it’s a talk about teleportation and mapping, but no less fantastical than the expectations of data journos globally (that we tell the future, that we perform our pony tricks on demand, that we manage to t[rans/ele]port). An area of growing interest in the data journo world is how we manage to create compelling narratives about remote happenings, and often these are through our modern tools of teleportation: things like Ushahidi’s BRCK or OpenNews’ Keyblur for deploying networks without Internet, or applications like Crowdmap, CrisisNET, and Media Cloud Focus, helping us to understand global coverage and crowdsourcing context from operatives on the ground. These applications are among the suite of devices at our current disposal for feats of science fiction fantasy, bringing our ambitions of teleporting and unicorn reporting all the more close to our realities of remote monitoring and pony-journo practice.

 


That’s What She sed: !awk Lessons From Fun[ctional] Programming

Somewhere at the intersection of unexpected genius, linguistic mastery, and femininity there’s a trope of compelling film/fiction that goes something like this: a character (ideally a woman or weakling) speaks a language that no one expects and suddenly reveals a competency or comprehension that strengthens his or her position, provides for some comedy, or drops a beat of provocative timing. This kind of surprising exolingual + monolingual situation is common and interesting. I’m thinking Daenerys Targaryen speaking Valyrian, or when Nancy Travis speaks Russian to her cat-callers in So I Married an Axe Murderer, or that scene in The Goonies when Corey Feldman, a child, gives the maid surprising instructions in Spanish, or those times on the subway when I can tell who the françaises next to me are gossiping about and giggle to myself at the semantic secrets I’m privy to by virtue of closet bilingualism. It’s a common and compelling scene, not one wholly relegated to spoken tongues; it has its echoes in computational languages too.

Unexpected fluency in a programming language is fascinating. There’s still an interesting amount of surprise that accompanies any woman speaking intelligently at a tech conference, or a child-ish programming prodigy who sells his company at 18 and enjoys wild and precocious success. With that in mind, I decided to explore some languages recently that I had little experience with, if only to investigate their utility, and build up some surprising and cinematic techcred of my own.

Ontology Web Language? (http://www.w3.org/2001/sw/wiki/OWL)

Informing this, a recent and short tumble into the land of Game of Thrones led me through the wikipedian labyrinth to LCS, this linguistic non-prof that constructs languages (conlangs), composed of member constructors (conlangers) and the responsible creators of languages like Klingon in Star Trek and Dothraki in Game of Thrones. My tangent into a trope sparked some curiosity about how we define computer languages and how we use them thereafter, and the authority of the inventors of these languages.

Like other languages, computational tongues are often indexed by stereotypes, but unlike spoken conlangs, which have evolved to express a multiplicity of (in)translatable nuances, CSlangs are often more objectively restricted by a purpose, not developed to express all of the things but rather to accomplish a task. Valyrian is “the only language for poetry,” while Dothraki is harsh and guttural like its speaker population; SQL is a “special-purpose” query language, Objective-C is a “general purpose” object-oriented language, Visual Basic is the “most-WTF-y” language; but even in these stereotypical distinctions, coders contend about what these adjectives might mean, and who is best suited to speak these languages in such-or-such situation. And personality presumptions align to these -types as well: women, being generally lovely and fluffy, are unlikely to speak brutal and ugly bash shell scripts… they should be front-end programmers because pretty, and easy. 😦

And despite this, I’ve been investing a bit of time in the prelims of every data project, somewhere between scoring a raw pile of data and shaping it up for a visualization, always accomplished via some language/library. New projects and experiments always make me wonder if there’s a better library, plugin, resource or language to articulate my objectives and otherwise get me the results I’m after, which in this case involve a bit of OCR and semantic analysis, batch processing and cleaning and file pruning where language all-around is pretty important. For this round of tech adventures, I settled on sed, but I’m sure the operations I’ll be performing in this post could be fairly accomplished by other languages. Further notes on my actual adventure can be found here, but as a quick suite of examples, say you have a batch of files whose extensions you need to change:

blog-origLS

You can do this with text utilities:

blog-textutil

This converts all .docx files to .txt in a given directory (ignore the bogus .pdf dud).
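For anyone who can't see the screenshots, here is a minimal Python sketch of the same batch conversion. It is not the original command: it assumes macOS's `textutil` tool and its `-convert txt` option, and the helper name `build_textutil_command` is made up for illustration.

```python
from pathlib import Path


def build_textutil_command(directory):
    """Build the textutil call that converts every .docx in `directory` to .txt.

    Only *.docx files are selected, so a stray .pdf in the folder is left alone.
    """
    docx_files = sorted(str(path) for path in Path(directory).glob("*.docx"))
    # textutil ships with macOS; "-convert txt" writes a .txt beside each input file.
    return ["textutil", "-convert", "txt", *docx_files]
```

On a Mac, handing the returned list to `subprocess.run(command, check=True)` should perform the conversion; printing it first is a cheap dry run.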

blog-textconvert

Then, say you need to restructure file names in a directory so that you can sort them, as I wanted to by date, but your current file format is something like this:

23NY080214.txt Or ##-NY-DDMMYY.txt

You can reorder characters in a set of files by running a sed script like this:

blog-sedreorder

This tells Terminal to break up the file name by “.” to represent characters and then re-order those parenthetical entities according to the numerical set order at the end of the line (4\3\2\1) where 4=YY, 3=MM, 2=DD, 1=23NY. It makes that reorder actionable for each (*) .txt file in the directory.
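The reordering described above can be sketched in Python with the same capture-group idea. This is an illustration rather than the sed script from the screenshot: the exact pattern is an assumption based on the `23NY080214.txt` example, and the names `reorder_name` and `rename_plan` are hypothetical.

```python
import re

# Four capture groups: 1 = "23NY" prefix, 2 = DD, 3 = MM, 4 = YY.
FILENAME_PATTERN = re.compile(r"^(.{4})(.{2})(.{2})(.{2})\.txt$")


def reorder_name(name):
    """Move the date to the front as YYMMDD so the names sort chronologically."""
    # \4\3\2\1 mirrors the group order described in the text.
    return FILENAME_PATTERN.sub(r"\4\3\2\1.txt", name)


def rename_plan(names):
    """Pair each matching old name with its reordered name; skip everything else."""
    return [(n, reorder_name(n)) for n in names if FILENAME_PATTERN.match(n)]


print(reorder_name("23NY080214.txt"))  # 14020823NY.txt
```

Feeding each `(old, new)` pair from `rename_plan` to `os.rename` would apply the change; building the plan first lets you eyeball it before touching any files.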

blog-rename text

None of these applications is really what sed was “made for,” but I found them pretty satisfactory implementations of the language for my immediate need. Taken together, all this got me thinking about linguistic development and about the “meta”-languages of programmatic thinking, the classes and cases of computational articulation that lead us toward fluency in one or more languages, preference, and eventual specialty in the operations most suited to that lexicon.

While living on a continent with ~3,000+ spoken languages, pidgins, and regional dialects, I also started thinking about how the diversity of computer languages compares to other paroles of parlance, and how our systems for organizing and inventing new tongues might best map to each other for optimal productivity. There are rough guides for this kind of crosswalk: expected hierarchies, rankings, paradigm comparisons, and schemes of which languages are appropriate for the most hardcore hackers (see also, the “Real Programmer” fallacy).

But to redirect the conversation to a more critical and less-subjective breakdown, it seems appropriate to consider the semantics of not just the language itself but also its classification schemas in trying to assess their flexibility and purpose. One of the beautiful things about objectively breaking down languages by purpose is that they can be ranked according to their flexibility and utility, their merits, rather than subjective judgements about their syntax. As with most anything in code, bash, or whatever scripting, part of the learning process is absorbing typical commands and the rest is playing with how to appropriately pair them for more complex operations (roughly: what commands are possible and how to link them). Snooping through Stack Overflow can usually get you pretty far on the first one; the second comes later, when repeated compartmentalized operations become exhaustive and your frustration has driven you to the point of investment in some serious study or thought on how to most efficiently arrive at your goal.

comp_linguistics

For this project, I selected sed because I’d read about its utility for my purposes. I’ve got several years’ worth of newspaper and journal data to convert from various file formats to one, and then rename in a batch before diving into the actual contents and cleaning and reformatting. Sed seemed appropriate for this. I could probably do it in Python or bash or JS or somesuch, and maybe there’s someone who’s already built an online GUI that automates all this… but I was looking for something that worked and something new to learn, a new dialect to surprise myself with. While I felt stupidly proud when surprising others with this workflow and earning the ‘hacker’ merit badge du jour at work, I didn’t choose it to be cool, I chose it because it fit my needs. I chose it because sed is simpler than awk and Perl, syntactically and performatively, but it provides a variety of text processing and regex support operations, and suits most things I would need in combination with other commands. I’m still at the ‘hello world’ stage with some of the magic of stream editors, but sed had some pun promise for the title of this post so I thought I’d go with that and see how far I could get with the operations that I wanted to perform.

And this is where I started thinking, perhaps there are other language paradigms to adapt for this purpose. Taking tips from symbolic and declarative languages might be useful, if only conceptually. I’d like to type in my desired output and allow the language to fumble through the mechanics of its implementation. When in SQL and I’m select from where’ing, I’d like to sed-ify that operation for data cleaning. Select *.csv from _ directory where _[date].csv. In researching and polling friends about additional “sql-ish” (pronounced “squish,” please) languages, I came across a few interesting features that I have yet to test in practice but seem like pretty cool operations to incorporate in a meta-sed lang.
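As a rough idea of what that “select from where” for files could look like, here is a small Python sketch. It is a thought experiment, not an existing tool: the function `select_files` and its `pattern`, `from_dir` and `where` parameters are all hypothetical names chosen to echo the SQL phrasing.

```python
from fnmatch import fnmatch
from pathlib import Path


def select_files(pattern, from_dir, where=lambda path: True):
    """Return sorted names of files in `from_dir` that match `pattern` and `where`.

    Reads roughly like: SELECT pattern FROM from_dir WHERE where(path).
    """
    return sorted(
        path.name
        for path in Path(from_dir).iterdir()
        if path.is_file() and fnmatch(path.name, pattern) and where(path)
    )
```

Something like `select_files("*.csv", from_dir="data", where=lambda p: "2014" in p.stem)` would then list only the CSVs with 2014 in their names (the `data` folder here is just a placeholder). You state what you want and leave the looping to the function.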

In the past, and via wikipedia, I’ve heard “declarative” applied to XSLT. Your blocks of code are statements, declared like: “when you get to {this} w/ property {that}, do {these things}.” You can declare them in any order and they will run in the appropriate sequence. However, is XSLT “declarative” according to all definitions? Diving further down the language research rabbit-hole had me questioning more of what “declarative” means in this context. Despite the overwhelming arguments you can get yourself into when defending the merits of one computer language over another, the terminology used to refer to different programmatic concepts and classification schemas can be vague, misleading and largely unhelpful if you approach them as a foreigner, with other linguistic fluencies influencing your translations. The term “declarative language” for example can reference “non-procedural”, but that is also valid for the other language styles. The author in this article linked above uses “where you declare…” to define his term of “declarative language.” With XSLT, you write blocks of procedural code, called in reaction to something in the source doc, otherwise unlinked to the calling procedure (“where you declare…”).
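To make the “when you get to {this}, do {these things}” idea concrete, here is a toy Python sketch of rule-based dispatch. It only imitates the flavor of XSLT templates and is not XSLT itself; the `rule` decorator, the tag names and the sample document are all invented for the example.

```python
import xml.etree.ElementTree as ET

RULES = {}


def rule(tag):
    """Register a handler that runs whenever an element named `tag` is reached."""
    def register(handler):
        RULES[tag] = handler
        return handler
    return register


# Rules are declared "out of order" on purpose: para comes before title.
@rule("para")
def render_para(element):
    return element.text


@rule("title")
def render_title(element):
    return "# " + element.text


def transform(xml_text):
    """Walk the document in document order and apply whichever rule matches."""
    root = ET.fromstring(xml_text)
    return [RULES[el.tag](el) for el in root.iter() if el.tag in RULES]


print(transform("<doc><title>Ponies</title><para>One</para><para>Two</para></doc>"))
# ['# Ponies', 'One', 'Two']
```

The output follows the order of the source document, not the order in which the rules were written, which is the property the paragraph above is pointing at.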

If you think of lots of front-end and web programming languages, they pretty much fall into this category: small blocks of code linked to a user interaction or operation (onClick listen –> then run {this}). The author features a bunch of interesting language paradigms like concatenated languages, but there are other, now (perhaps) obsolete meta-languages that also address these concepts with more flourish and in many cases the same hiccupy classification semantics that can obscure their utility. What about languages made to describe algorithms, APL-ish tongues with general and placeholder operators, “compression functions” to apply operators pairwise to members of a vector, and right-to-left execution sequencing? Or what about REXX, a shell scripting language using juxtaposition and ‘||’ interchangeably for concatenation, using blanks as operators? Even the semantics of concatenation have been through debates about the appropriateness of the term to “co-chain” vs. just catenate (“chain”).
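The pairwise, right-to-left operator application mentioned for APL-ish languages can be approximated in Python. This is a sketch of a right fold under my own reading of that description, not APL semantics in full; `reduce_right` is a hypothetical helper name.

```python
import operator


def reduce_right(op, values):
    """Fold `values` from the right: op(v0, op(v1, ... op(v[n-2], v[n-1])))."""
    items = list(values)
    if not items:
        raise ValueError("reduce_right needs at least one value")
    result = items[-1]
    # Work backwards so the rightmost pair is combined first.
    for value in reversed(items[:-1]):
        result = op(value, result)
    return result


print(reduce_right(operator.sub, [1, 2, 3]))  # 2, i.e. 1 - (2 - 3)
```

A conventional left-to-right fold over the same list would give (1 - 2) - 3 = -4, so the direction of evaluation is part of the language's meaning, not just its syntax.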

Both conlangs seem to require quite a bit of syntactical adjustment but have features I’ve never seen echoed in other languages. And still, the point is, no one remembers these syntactical idiosyncrasies; languages are remembered for what operations they perform and how well. Our memories are operation-orientated, perhaps not solely focused on syntax. Are these lexicons appropriate for high poetry, are they guttural and direct; what do they evoke, how do they surprise?

Plus, I’m wondering if I even understand how to appropriately use and manipulate a language when I’m not sure how to best describe it. Taking a page from my spoken fluencies, those languages that I know best and feel most comfortable using in practice are always those whose grammar and constructs I can explain and justify with greatest ease. There’s little mastery in the unwritten blundering I do in Swahili or Creole, though I’ve spent serious time in places where they were spoken; English and French, the product of formal study and informal fumbles, I totally own like whoa.


In programming there are declarative and imperative paradigms; likewise there is an imperative mood (expressing commands) in most spoken/written languages. One might read Dothraki or Klingon, a brutal class of LCS languages, as particularly “imperative” in their ‘commanding’ manner and unapologetic guttural articulation. But what might be the meaning of declarative? Do many people know? The internet suggests not. As per uszhe, everyone has his own definition, disambiguations + citation needed, wikipedia, hint hint.

So what’s the best language to communicate what we want, when the writing about languages is indecisive and muddled? Probably, and unsurprisingly, the language you speak best. True masters can adapt languages to their purpose, but most still recognize that CS languages are freighted with an intention, and this limits their applicability to all situations. The ambiguity of classifications like “declarative” in reference to a few languages or other terms applied to and restricting language adoption crumbles when you consider languages for their ideal operations, and not their syntax or semantics. What is the purpose of the language, how to absorb typical commands and how to appropriately pair them for more complex operations? Operation-oriented language selection (ruby is good for… and …) rather than grammaticentric (ruby syntax is “bloated and confusing“) might be the best approach for study; one that respects the romantic tropes of surprise, and pushes you to build a vocabulary based on the declared objectives of your goal, rather than the pretense of some predefined language hierarchy.

That, appropriately, perhaps unsurprisingly, is what she sed.

Note: I like to alliterate my titles so if you thought this would be a post about functional programming and are now disappointed, you should check out my friend Jonathon’s post on functional programming coming out in Smashing Mag at some point in the soon, or this explanation series which is fairly brill IMHO.

If you wanted more stream editing and shell scripting, some resources you might enjoy are this one, and for awk reading (the best!), this one.


Crowd-ed + Coordinated: FOSS in Africa

“There’s no more powerful force in modern society than the news. It shapes how we see the world, what we judge to be good or bad, important or silly, right or wrong.”
~ Alain de Botton, “Have you Heard the News?” Psychologies, 4/2014

In the April 2014 issue of Psychologies Magazine, Alain de Botton’s interview discusses his new book The News: A User’s Manual, and his philosophical reading of the news as trending toward the more personal, the more philosophically predictable. It’s perhaps significant that I’m reading this article at an airport news stand out of a pop magazine, rather than reading his book. More on this trend in abbreviated news ingest later…but for now, his points about our pot-boiler appetite for the news do well to introduce some of my recent professional happenings, perspectives on crowd-driven data journalism, and a particular perspective on crowd-data programs in Africa.

Nairobi - Crowdmap of Tweets

In Nairobi, while the news has been of late focused on other topics, the last two weeks of my workflow concentrated on two conferences, the IDLELO FOSS conference and a Global Innovation Competition for citizen-driven government initiatives; they share crowdsourcing and open journalism as themes. I had the pleasure of speaking at the IDLELO-06 conference, supporting Ms. Angela Odour’s talk on Ushahidi prior to preparing my own with James Raterno and Daniel Cheseret of Internews-KE. As one of the few journalism organizations presenting, we applied the free-and-open-source-software (FOSS) theme to investigative news reporting and interactive political commentary. Our talk was a case study in health projects, demoing three interactive news stories from this past year at Internews-Kenya. Each interactive delved into some aspect of health monitoring in Kenya, spanning a spectrum of topics from medical services availability to mapping the outposts and effects of extractive industry across the country. While the details and data behind these stories are important and interesting, the presentation in each case was paramount; TL;DR the realities of healthcare and the economic/industrial health of the nation were best communicated via interactive charts and Internews’ series of Data Dredger infographics. The refrain of this and de Botton’s Psychologies perspective persists: attractive and interactive stories, stories that engage with personal, psychological topics, stories that illustrate rather than allude to data are driving our journalism programs and our teams.

And part of that means democratizing the newsroom to a broader population of citizen journalists and crowdsourced contributors. Part of it also means broadening our view of where data journalism trendsetting is happening in our world. But to persist on these points, let’s move off the African continent briefly. Among the most popular articles in the NY Times last year were approachable, interactive pieces; it’s not unreasonable to conclude that the appetite for news often bends to people’s visceral interests, regional perspectives and even “popular biases” as de Botton suggests in his Psychologies interview. Likewise, the Guardian’s most popular articles of 2013 (among the Snowden and Boston bombing coverage) include the following:

  • Why have young people in Japan stopped having sex?
    3.2m page views, 1,263 comments
  • Michael Douglas: Oral sex caused my cancer
    2.0m page views
  • Royal baby: Duchess of Cambridge gives birth to a boy – live
    1.5m page views

This is not to suggest that the most popular news publications follow predominantly potboiler subject lines, but rather to note that there is a persistent appetite for pop culture throughout all news sources and dissemination platforms, irrespective of reputation. Mixed in with the seriousness and severity of crises worldwide, the presence of pop culture news commands significant attention; perhaps reflecting an appetite for popular and approachable media. When de Botton claims that “the ideal news would take into account people’s natural inclinations…it wouldn’t start with the wise, good, or serious outlooks,” I thought the judgement was a bit unfair and dismissive of journalism’s future, but maybe, on reflection, not so removed from reality in journalism’s present (Psychologies Magazine, 54).

This media appetite is agnostic to journalism hierarchies, persistently attracted to personalized stories that show how one girl lives in NYC projects, or how a population’s accent differs according to regional divisions. We crave a personalized experience with the news even in the most distinguished publications; we crave a flat structure of open contribution, where the stories are interactive, where we can comment publicly in the thread following each post, where the content is sometimes crowdsourced, and the platforms are participatory. Our appetite for pop culture parallels publication output. In a digital media landscape where everyone from Buzzfeed to Fbook to O.K. Cupid has a data science team, our population of increasingly connected readers is interested in the personalized analytics of their networks, in the data science that drives our personal lives and pop culture as much as our professional publication platforms, and sometimes, in how all of these data fuse.

One way to adapt to this is to invite more contributors into the news reporting community from the reported community; to flatten the reporting structure, to amplify the data-driven projects that drive the page view counts often used to index our community impact. Promoting “popular” media isn’t just about echoing celebrity gossip and simplified story-lines but rather developing a sensitive authoring practice, crafting stories that readers can identify and interact with, and this trend is carrying into bootstrapped newsrooms across the African continent and throughout the world. In supplement to interviews, we crowdsource data collection in the way of Ushahidi; instead of relying on the lone-wolf work of a relocated investigative journalist, we train teams of indigenous journalists to report on their own local communities in the way of Internews. I’m privileged to work with organizations actively contributing to this type of globalized citizen journalism and crowd-reporting, likewise privileged to work with journalists when I am at best an “outsider-[FOSS]-artist.”

This is not new science of course. Most established papers have data teams these days, and it’s not uncommon for teams of developer-journalists to collaborate on investigative pieces. But recognizing these trends as reflective of an interest in crowd-driven projects and citizen-journalism engagement globally is perhaps important and worth considering as we re-evaluate where journalism is, and where it is going.

Crowd-sourcing information, crowd-funding and crowd-feedback loops in the journalism community are more popular, and not just in the USA. Analytics permit us to track what our crowd of readers actually reads (or at least what they click on), to adapt our stories and investigative practice to suit those interests. Though we still have a rockstar reporter hall-of-fame that celebrates individuals and their contributions to the industry, with data-driven projects, we can now appreciate more than ever, that often, and maybe always, the byline includes a team, a small crowd of developers-journalists-researchers working on a comprehensive and data-informed investigation.

“I doubt if it makes much difference, frankly, but at the margin I think that we’re moving to a kind of journalism that is more casual, more informal, more personal, and a very formal byline seems as out of place as a three-piece suit in the newsroom.”
~ Nicholas Kristof, “What’s Missing in my Byline,” New York Times: Opinion Pages, 1/2014

And this isn’t only happening at the New York Times or The Economist; it’s happening in Africa too. This brings me to the second conference happening of the past two weeks of work. At this week’s Global Innovation Challenge in Nairobi, we’ve been working with teams of selected delegates from 10 countries around the world, teams who are working to connect their citizens more directly with their governments and foster policy change through open data. This type of effort can read as a quixotic ambition, but with developer and data-driven programs, it is possible.

Further, it’s noteworthy that all of the delegates are paired teams, not lone crusaders. These efforts are built on partnerships between multiple contributors (developers, political activists) and multiple institutions, on crowd-driven programs meant to collect a maximum of opinion and surface a range of opinions from a representative sample of constituents. Supported by Ushahidi and hosted by iHub, this week of conference talks, pitches and programs is designed to foster more crowd and community driven data reporting across the globe, and model the crowd-centric trends so observable in our increasingly personalized and popular media.

Crowd-driven journalism and FOSS initiatives have in one respect opened the community to a broader population of self-taught developers and scrappy reporters, and also broadened the potential for citizen-sourced, -funded, -voted journalism projects. The crowd will doubtless drive even more data projects in the future, and craft a more personalized and popular media with a global scope. Crowd + Africa doesn’t have to mean crisis mapping or violence; it can mean participatory reporting and progressive reform, it can mean a program of re:activism, or react-ivism, piloted by a crowd of programmers and a ragtag group of pirates and outsider journo-artists. We’re working to amplify the crowd, and data-driven newsrooms internationally, in keeping up with the [journalism] Joneses.

Ushahidi Ecosphere Diagram

To that end, and in conclusion, I leave you with a link to our Ushahidi community survey, an effort on our part to make crowdsourcing a part of our own analytics and feature development workflow. Please fill it out so that we might improve our software and help other investigative journalists spin up custom instances of geo-local data collection all over the world:

HELP US OUT, FILL THIS OUT:

CROWDMAP COMMUNITY SURVEY


Images in this post courtesy of XKCD, IDLELO06, Global Innovation Competition, and FloatingSheep.org (African tweetmaps)

 
