Images and Webscale Demand
By John MacColl
This post was contributed by John MacColl, an OCLC colleague who joined RLG Programs in November 2007 as European Director, having previously worked in a number of UK academic libraries since the mid-1980s. Based in the University of St Andrews in Scotland, his role is to work with the RLG European Partner libraries, and to lead programs and projects in areas related to his expertise in scholarly communication and digital library technologies. -Karen
I'm John MacColl, and I was at the Dutch National Archives last month with Karen Calhoun, where we were both speaking at the Dutch Customer Contact Day, which OCLC EMEA runs each year for its many Dutch customers. Following Karen's presentation, I gave a presentation whose tongue-in-cheek title was inspired by the organisation hosting the meeting: "Are archives the new libraries?" The title also reflects our growing interest in helping research libraries digitize their unique and rare materials - archives as well as rare books and manuscripts - and put these materials onto the web as a priority. It was therefore a teasing title, suggesting that much of the business of archives is now coming to the fore for research library managers.
There are significant problems associated with exposing this material to the interested minds that, evidence suggests, are out there and hungry for it. Much of it is fragile, and traditional library practice has been to prioritise conservation and preservation ahead of access. As a result, large quantities of rich research material have effectively become hidden collections. We have focused on this problem many times in different ways over the years. See, for example, Merrilee Proffitt's blog post on a talk earlier this year by Richard Ovenden from the Bodleian Library. Archivists, however, received a wake-up call in 2005 from the paper by Mark Greene and Dennis Meissner, which John Chapman mentioned in his earlier post to this blog. The paper made a huge impact in the archives world by advocating a minimalist and demand-led approach to cataloguing in order to address the hidden collections problem.
Libraries are learning - partly from archivists - that in the digital age, satisfying demand is a different proposition from the one they were familiar with in the print age, when scale in public document management was something they largely controlled, and usage, impact and demand were only crudely measurable. Now we have webscale, and we can see what users are looking at and downloading. Libraries own content which has the potential to be hugely popular and useful in this webscale world. One of the most interesting recent tests of this was the Library of Congress's use of flickr at the beginning of this year, when it put up 3,100 digital photographs of news images from the first half of the 20th century in the newly launched flickr commons.
LC's experiment with images on flickr commons illustrated well the hidden demand for hidden collections. My colleagues and I use it as an example frequently in presentations (I stole the slides from one of my Program Officer colleagues). Slides 29-33 of my presentation tell the story as it unfolded and was told, with some astonishment, in the project blog. Twenty-four hours after the images went onto flickr, they had attracted over a million views, 420 images had received comments, and every image in the collection had been viewed. The impact could not be overstated. While the images could of course be viewed on LC's own website, the difference is that the LC site (currently the 3,236th most visited site on the web) attracts nowhere near as much web traffic as flickr (currently the 31st most visited site on the web). As the slides go on to show, LC cataloguers found that many of the user comments on the images were extremely helpful to them in improving the image metadata, and 89 images had their records updated as a result.
LC's use of flickr was picked up in later postings by Lorcan Dempsey and by Günter Waibel, who provides an interesting update on the experiment, reflecting on the question of how to balance webscale provision (should huge numbers of images be provided to flickr all at once?) with human-scale appreciation, as a community of interest developed around the growing collection. The Library of Congress is still evaluating its pilot, and one of its issues will surely be how much cataloguer effort can be dedicated to sifting through user-contributed comments. That's a challenge. But the scale of the demand for these rich examples of research materials is clearly in evidence, and the Library of Congress has now been joined by several other museums and libraries in the flickr commons, including - nicely - the Dutch National Archives (no images of Karen and me on the podium in their photostream, however).