With only a week before I leave for SxSW, it seems appropriate to comment on a topic that has been circulating in the digital libraries and labs that I frequent. The popularity of crowdsourcing content on the web and the increasingly attractive quality of “learning communities” facilitated by virtual venues come at a particularly prolific and largely user-generated period in digital production. Fresh from reading some plucky Clay Shirky publications, I’ve eavesdropped and equivocated in conversations about the value of adopting folksonomies over structured ontologies, and of allowing users to edit and contribute without restriction in promotion of access and intellectual sharing, and now I’m potentially prepared to extend my hi-5 to the crowdsourcing crew. In the spirit of tagging, social or otherwise, I’ve punctuated this post with some photos from a trip to 5ptz in Queens.
In the world of the digital library, departure from a structured file system or a controlled vocabulary makes most catalogers shudder, though in our increasingly digital world, traditional catalogs can no longer classify and code content to be appropriately semantic for user search. Our records, so carefully articulated in “magic number” MARC, can no longer cope with the extensive linking and tagging that pervades our virtual universe; academic data repositories and Yahoo indexes cannot hope to classify and predict all queries. Further, when dependent on user-generated content, contribution to any online collection becomes more about a kind of clever one-upmanship than about the polished professionalism of an academic posting or the structured standard of an ontology (here’s lookin at you, reddit). An emphasis on participation and education seems to be the skeleton of ontology in the anything-goes environment of the internet. People are motivated to participate when a collective forum establishes an environment framed as ‘fun,’ < warning-neologism> a kind of “collaboratory,” </ warning> with more incentive to share ideas than to proprietarize data.
In my own experience, collaborative and crowdsourced endeavors do characterize content in the online environment. At SLA@Pratt’s Linked Data event earlier this year, Tagasauris reps spoke about their partnership with the Museum of the City of New York to crowdsource cataloging efforts through a population of the MCNY’s Twitter followers and the technology of Amazon’s Mechanical Turk. They based their initial idea on Shirky’s concept of “cognitive surplus” (and loosely on an addendum that people with lots of time on the internet can be “put to work” through workflow management with the veneer of a “social network”). This might read as unethical, but it allowed them to catalog and collabo-tag 16,000 photos in a month, which was incredibly rapid and perhaps more appropriate in its folksonomic structure for an online search. Likewise, as part of SxSW’s Interactive conference, the accelerator and pitch sessions on sxswsocial aim to foster creative collaboration and networking; I have yet to contribute much, but the forums are populated with really brilliant things and fascinating people. Despite the protective impulse of many developers when it comes to their startup concepts, educational and idea environments like this inspire contributions.
The vogue of collaboration for creative production is continuous across disciplines, to the point where the ‘digital humanities’ are potentially moot and maybe the ‘digital everything’ would be more apt. Examples in TED talks and web ads abound, as the Guardian’s open journalism project and Whitacre’s Virtual Choir would suggest. Still, I think it’s worth noting that even the web-ready classification systems that Shirky describes remain anchored in some kind of analog system; even Delicious asks users to “dig into the stacks” of existing tags. In the context of Kahle’s one-copy library, as recently featured in the NYTimes, even internet archival efforts are tinged with a comparative attachment to physical record-keeping. The physical footprint will outlast the virtual bookmark, though the current catalog seems ill-equipped to index variable and virtual media. It seems the only way to deal with this data is to admit that adaptive, crowdsourced tag technology is pitch-perfect for mellifluous management of the data deluge.