Karen Calhoun: November 2008 Archives
In recent talks for library cataloguers on "the new world of metadata," I am often challenged for real evidence of a shift from traditional cataloguing tasks to more broadly defined metadata activities--and what kinds of activities are we talking about anyway? In other words, I have a sense that at least some of my audience members are from Missouri (the "show me" state).
This post provides some evidence that the transition from cataloguing to metadata work is well underway and invites your collaboration in providing more. I'm not aware of any systematic studies of this transition conducted in North America (are you?--if yes, please share). However, an excellent study from down under--Directors Views on the Future of Cataloguing in Australia and New Zealand, 2007: A Survey, by Jenny Warren of Monash University--speaks to the topic at hand. Warren's respondents are split on the question of whether cataloguers have already transferred their skills to metadata work. Fifty-four percent of the respondents to the Warren survey believe the shift to metadata work is underway in their libraries; forty-six percent believe the shift has not begun (see question 12a of the survey).
Those respondents who believe the shift from cataloguing to metadata is already happening in their libraries were asked to provide examples of the transition (question 12b). A sampling of their answers follows:
o Participation in or leadership of institutional repository, federated search, or image database development
o Maintenance of non-MARC databases
o Advising on the creation, structure and metadata for other library databases or information systems
o Working with thesauri (beyond the traditional library ones)
o Implementing and maintaining Dublin Core or EAD
o Tagging projects
o Indexing projects
o Developing and maintaining crosswalks and metadata conversions
o Becoming the metadata guru for websites
This seems a familiar list to me, based on what I have observed in North American libraries in transition. Other significant areas of activity that I'm aware of for cataloguing/metadata units in transition include help with learning management systems' metadata and information access services. One related example is the significant contribution of the MIT library's Cataloging and Metadata Services (CAMS) department to MIT's OpenCourseWare Project. In fact, MIT's CAMS mission statement seems to me an exemplar for a department in transition, and the MIT approach to offering for-fee metadata services is typical of the direction that I have observed several university libraries taking.
I would like to begin an informal, occasional series on this period of transition from cataloguing to metadata in libraries. Would you be willing to share your success stories and hard-earned lessons on this blog? I believe your shared experiences could be quite useful to those working through transitional issues in their own libraries. If you have a story that you think could help others better define their own path forward, please comment briefly on this post, and I'll be in touch about next steps. Thanks for considering this opportunity to share your experiences and help colleagues in other libraries.
Students of the quality movement can tell you that the definition of 'quality' has over the past decades shifted from one based on conformance to requirements or rules (the classical definition founded in quality control of a product or process) to one based on fitness for use (a perspective based on how well the product or service meets people's needs). In a practical sense, however, any organization delivering a product or service--like a bibliographic database--has to pay attention to both aspects of quality.
In recent months my colleagues Janet Hawk, Joanne Cantrell and I have been examining what is most important about WorldCat metadata to a range of audiences; learning where WorldCat metadata falls short of meeting the needs of its users; and planning a course of action to sustain and enhance the value of WorldCat metadata to its varied constituencies.
Our forthcoming recommendations are based on information gathered from several surveys of different OCLC constituencies--faculty and graduate students, undergraduates, and casual Internet users on the one hand, and collection development, technical services, resource sharing, and public services librarians on the other. Each of these constituencies defines WorldCat metadata 'quality' in its own way. Quality means having the 'right' data, but the 'right' data varies across constituencies.
Many of our findings are relevant not just to WorldCat as viewed through its different interfaces or access methods (WorldCat.org, WorldCat on FirstSearch, Connexion ...) but to library catalogs and discovery interfaces in general. At the Charleston Conference, Janet Hawk and I made our first presentation on these generalizable findings. At some point the presentation will be available on the conference web site; in the meantime, I have uploaded it for viewing on SlideShare. Your comments are welcome.
There has been much discussion around OCLC's effort to update its record use guidelines. We have listened to the feedback and, as a result, have adjusted both the updated policy and the FAQ. We have modified our approach to WorldCat attribution (field 996). If libraries do not wish to retain the 996 field in downloaded WorldCat records, they are free to remove it. In addition, libraries are free to either add the 996 field to existing records they transfer to others, or not, at their discretion.
In this process, and in connection with my role on OCLC's Record Use Study Group, I've been asked more than once why OCLC felt the need to update its policy and why member libraries should support the updated policy. This post is an attempt to answer those questions.
Time for a change
In Web years, the Guidelines for Use and Transfer of OCLC-Derived Records, last updated in 1987, are not just 21, but as old as Methuselah. While the principles underlying the Guidelines have held up well with respect to sharing among libraries, the language and 1980s context of the document have made the Guidelines increasingly hard to understand and apply. The Guidelines have also been frequently faulted for their ambiguity about WorldCat data sharing rights and conditions.
What's the Web got to do with it?
The disruption that the Web would bring to libraries became painfully obvious when the Perceptions of Libraries survey report was released, revealing how much more likely respondents were to begin a search for information with a search engine (84%) than on a library Web site (1%) (see page 17 of 34). In our "attention economy," that's bad. OCLC's response to libraries' loss of attention in Web space, slow at first and quickening over the last couple of years, has been to start building Web scale for libraries (see Dempsey and Lavoie 2008).
OCLC's actions since I rejoined OCLC 18 months ago, as well as the passionate commitment of its leadership and staff, have been focused on placing library collections prominently on the main boulevards of the Web, right there with the Googles, Yahoos, Facebooks, and Amazons. WorldCat.org's purpose is not so much destination Web site as "switch," driving traffic from popular Web sites like Google Book Search to libraries' collections. The switching mechanism is beginning to work: in one recent five month period, 87% of the referrals to WorldCat.org came from search engines and other Web sites; less than 13% came from a typed or bookmarked URL to WorldCat.org itself. This week, a Hitwise commentator, reporting on downstream websites from Google Book Search, noted that 22% of visits from Google Book Search go to an Education website, with WorldCat.org the #1 Education website.
There is plenty of work remaining to be done to make the present cloud of disparate library systems, loosely tied together with WorldCat in the middle, attracting Web scale traffic from the Amazoogles, function reliably enough to deserve the name "Web scale for libraries," but the first hard steps are behind us.
To play the role it is now playing on behalf of libraries, OCLC needs to be a player on the Web, and not just any player, but an influential one. It therefore needs to be a Web company, with data sharing policies and practices appropriate to the Web.
OCLC has been severely criticized for its WorldCat data sharing policies and practices. Some of these criticisms have come from people or organizations that would benefit economically if they could freely replicate WorldCat. Other criticisms have come from a genuine commitment to openly sharing data on the Web in ways that will help libraries continue to thrive. Whatever the criticisms' motivation, the overarching point has been that the 1987 Guidelines overly limit WorldCat data sharing for new, Web scale uses of WorldCat data. At least some, if not many, of these uses would be consistent with OCLC's chartered purposes and in the interests of the OCLC membership--especially those that will expose member library collections on high traffic Web sites.
Attribution and linking to the policy
An objection to linking to the policy goes something like "OCLC has no right to tell my library what it can do with its records." This is to overlook the 21+ year history of the Guidelines, which have governed what libraries can do with OCLC-derived records, albeit not always unambiguously. See for yourself by taking a good look at the Guidelines. Their stated rationale for imposing conditions on libraries' record sharing is that "member libraries have made a major investment in the OCLC Online Union Catalog and expect other member libraries, member networks and OCLC to take appropriate steps to protect the database." To that end, under the Guidelines, non-commercial record sharing between libraries is ok, commercial sharing is not, at least not without a separate agreement with OCLC. Sound familiar? The principles of the Guidelines match those underlying the updated policy--promote as much sharing as possible while protecting members' investment in WorldCat.
The difference is not in the principles, then, but the environment in which the principles are applied. The Guidelines came from the limited data sharing environment of the 1980s. The updated policy's landscape is the Web and the incredibly dynamic data sharing environment it represents. You couldn't tell from recent traffic on library listservs and blogs, but attribution of the source of data that is reused in another context is standard practice on the Web. Have a look at the license pertaining to Wikipedia, the GNU Free Documentation License, for example:
"You may copy and distribute the Document in any medium, either commercially or noncommercially, provided that this License, the copyright notices, and the license notice saying this License applies to the Document are reproduced in all copies, and that you add no other conditions whatsoever to those of this License." (Section 2)
The perception of some bloggers and posters is that, unlike an article contributed to Wikipedia, the OCLC cooperative has no claim on WorldCat records that a library has copied for local use; but attribution is not a condition of Use as defined in the updated policy (Section C). Attribution is a condition (and now an optional one) of Transfer--that is, the updated policy asks for a link to the policy to be carried in a copy that moves downstream from the library that is using that record. In this way the link is serving exactly the same purpose as the link to the GNU license covering the copying of Wikipedia articles--providing provenance and information about the rights and conditions associated with this piece of information floating around the Web.
We think the request to link back to the policy (using the 996 field) is reasonable and will actually facilitate and reduce the costs of downstream WorldCat data sharing on the Web, where metadata is constantly exchanged, remixed, and mashed up. Notwithstanding what we think, it is clear that some of our members strongly oppose or, at best, do not understand OCLC's reasoning. And so, the burden is on us to step back, explain, and do our best to gain support for this change in the community we serve. That is the course we have chosen.
OCLC will go ahead with implementing the 996 field in mid-February, because we feel it is the right thing to do and in the long-term interests of the cooperative. However, if libraries do not wish to retain the 996 field in downloaded WorldCat records, they are free to remove it. In addition, libraries are free to either add the 996 field to existing records they are transferring to others (or alternatively make copies of the policy available to transferees), or not, at their discretion.
'No rights reserved' or 'some rights reserved'?
The Record Use Study Group began its work with an environmental scan of the data sharing policies of a variety of content and metadata providers on the Web. We learned that, while the blogosphere is noisy with proclamations that "data should be open and free," nearly all organizations have terms and conditions for sharing--that is, they reserve some rights over how their content or data is used and transferred to others. My presentation to the Libraries and Web 2.0 Discussion Group at IFLA a couple of months ago lays out the findings and conclusions of our investigation of the data sharing landscape (if you download the file you can see the speaker notes).
The Study Group was particularly influenced by the Creative Commons set of licenses. It is no accident that the structure of OCLC's updated policy mirrors that of the Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported license (Definitions-License-Restrictions-Representations Etc.).
Creative Commons licensing provides an alternative to full control over content and metadata, building a bridge between a world that regulates every use--that is, "all rights reserved"--and an anarchic world in which content and data are exposed to exploitation. Under the variety of Creative Commons licenses that are available, content and metadata rights holders can protect their works while encouraging the freedom to remix and reuse content and metadata under specified circumstances--that is, to declare "some rights reserved." (And yes, while we considered simply adopting a Creative Commons license, we chose to retain an OCLC-specific policy to help us re-express well-established community practice from the Guidelines.)
Meeting information seekers' expectations requires--demands--more open sharing of library data on the Web. The OCLC leadership understands this and is committed to making it happen. However, the financial realities of many organizations--including OCLC--require careful transitions to more open models. It is not free (that is, without cost) to build services upon the valuable metadata contributed by members, and it is not free to be the steward of WorldCat (I discuss OCLC's stewardship/curatorial role here). To survive, OCLC must recover these costs, and it must do so from members; its funding does not come from government agencies, from higher education, from foundations or donations, from advertising, from parent companies, or from venture capitalists.
OCLC's updated policy represents a first step toward truly modernizing WorldCat data sharing policies. The text and the practices we have proposed (e.g., the WorldCat Record Use Form, the linking field) are a balancing act. On the one hand, we have tried for a policy that will have the effect of opening WorldCat data to new uses by libraries, museums, and archives while fostering partnerships from which innovative noncommercial and commercial uses of WorldCat data can emerge. On the other, we have tried for a policy that will assure the continued economic viability of WorldCat and the WorldCat-based services provided to the cooperative.
Why should libraries support the updated policy?
OCLC's update to the Guidelines intends to modernize them for application on the Web, foster new uses of WorldCat data that benefit members, and clarify data sharing rights and restrictions. It is definitely not OCLC's intent to rein in libraries' or consortia's customary use and transfer practices that have been in place for years and that have resulted in important library resource sharing systems and union catalogs. That is not what the updated policy is about, at all. It is our intent to implement an updated policy that will allow us to better manage commercial uses of WorldCat records, to assure such uses benefit members, and to defend against uses of WorldCat that could destroy the cooperative.
The updated policy is a legal document. Being a player on the Web, working on behalf of libraries, requires that the policy be a legal document. I know that some librarians will be uncomfortable with that, but it is necessary.
OCLC's and the members' central asset is the WorldCat database that we share. It is our common investment, our "commons." I believe it is the right course to protect the commons.* Thus, as Garrett Hardin has suggested in his writings about the "tragedy of the commons," it is appropriate to regulate the use of the commons. OCLC needs to manage WorldCat data sharing to assure that benefit accrues back to the members who have invested in WorldCat, and that the WorldCat commons is not exhausted through over-exploitation. Protecting the commons means adopting "some rights reserved" as the data sharing model. While a data sharing model based on "no rights reserved" is a laudable ideal, if OCLC were to adopt such a policy, it is possible, if not likely, that the WorldCat commons and the OCLC cooperative would not survive.
We believe that libraries, the wider library-archives-museum community, and those they serve will benefit from the updated policy without placing our shared investment in WorldCat at peril. As OCLC's four decades of working with libraries to increase access to and use of the world's information demonstrate, sharing burden and benefit is a proven and remarkably durable cooperative model.
*I began drafting this post last weekend. It was interesting to read a blog post about the policy today that uses the same analogy.