Data Sharing, Libraries, and the Landscape of the Web


On Monday I had the opportunity to speak in Denver at the ALCTS Forum "Creating and Sustaining Communities around Shared Library Data."  LJ's Norman Oder provides a substantive, fair summary of the 2-hour session.  For those with an interest I've made my slides available on SlideShare.  

I found the panel presentations and discussion with attendees constructive and helpful. I took quite a few notes so there is much to ponder. Besides sharing the URL of the slides, I thought I would also offer some thoughts about one of the topics that occupied the speakers and audience briefly, the role and significance of WorldCat.org.

One speaker at the Forum wondered about the need for WorldCat.org as an aggregation of information about library collections.  From a different source this past week, I have heard OCLC's commitment to comprehensiveness in WorldCat.org misrepresented as an aspiration to monopoly standing in the library world. The OCLC database is the largest of its type and in some way serves around 69,000 libraries in 112 countries. But considering the number of libraries in the world and the number of cooperative services and catalogs they use, such a notion about OCLC's purpose for WorldCat ranges from misinformed (in North America) to laughable (in most other places in the world).* Instead, the purpose of growing WorldCat.org is to begin attracting more attention to the world's library collections on the Web by providing a "point of concentration" to collect and drive traffic to local libraries or consortia.

I would argue that WorldCat.org is a good thing for OCLC member libraries already, and it has the potential to become a great thing. It brings eyeballs to library collections both collectively and individually--attention that otherwise will remain monopolized by the most successful Websites. For an interesting perspective on the landscape of the Web, see the map that Information Architects Japan has created.  IA's map overlays influential Web sites on the Tokyo-area train map. 

What IA's map tells us about libraries on the Web is not new. The loss of information seekers' attention to traditional libraries became painfully obvious four years ago when the Perceptions of Libraries survey report was released, revealing how much more likely respondents were to begin a search for information with a search engine (84%) than on a local library Web site (1%) (see page 17 of 34).

WorldCat.org, introduced in 2006, is a response to brick-and-mortar libraries' loss of attention in the Web landscape. The strategy is to make library collections everywhere much more visible in the main stations on the Web. Today, WorldCat.org is a destination on the Web, yes. More importantly it's a "switch," driving traffic from popular Web sites like Google Book Search to 10,000 OCLC member libraries' collections.** Very recently, a switching mechanism from the Web to many OCLC member libraries has begun to work from mobile phones, as described in the announcement of the WorldCat Mobile pilot. 

While there is plenty of work remaining to be done to consistently and reliably connect searchers from popular sites to library collections via WorldCat.org, the first hard steps have been taken, and OCLC is committed to making WorldCat.org work better for more libraries. The switching mechanism does work: a few months ago, a Hitwise commentator, reporting on downstream websites from Google Book Search, noted that 22% of visits from Google Book Search go to an Education website, with WorldCat.org the #1 Education website. There is reason for optimism that the connections to library catalogs from the Google search engine via WorldCat.org will improve, based on recent exchanges between Google and OCLC.

Going forward, for WorldCat.org to be an effective switch to libraries, it needs to be more comprehensive and connected.  To achieve its potential to help libraries, it needs to be a "point of concentration"--a large store of information about the content and whereabouts of library collections around the world. Its links need to be embedded and more visible on more of the Web's busiest sites. In these ways, WorldCat.org can help online travelers pass through the main stations of the Web and disembark at their local libraries, wherever those libraries are, from Ohio to Oslo to Okinawa.

 

---------------------
*Outsell's 2008 report estimates there are 484,990 libraries worldwide--109,795 in North America.

**To try it out, go to Google Books and search "everything is miscellaneous," then click "find this book in a library" on the book description page.

 

7 Comments

Hello Karen,

According to the Library Journal article you mentioned, you "clarified that the FAQ was indeed part of the policy." The way I have read the policy has led me to be skeptical (as the article reports others are) that this is so. Can you confirm that the FAQ is legally part of the policy? If so, shouldn't that be put in the policy text?

Thank you,
Edward

Karen Calhoun said:

Hi Edward, I checked with OCLC's legal department before making that statement at the ALCTS Forum. They said that any legal interpretation of the revised policy would include the statements in the FAQs and it would be very difficult for OCLC not to stand by them.

In any case, now that the Review Board is in place, we do not know what the policy will look like in the end and how it will relate to an FAQ document.

Karen

Tim Spalding said:

Regarding the FAQs, you might want to remove the Policy's statement (§E7) that it is the "final, complete and exclusive statement of the agreement of the parties with respect to the subject matter hereof." That's contract language 101--the contract is the contract. If you don't agree, I hope you haven't rented an apartment or bought a home recently.

On the rest, here's why I think this is wrong:

1. The web already has "switching mechanisms." Google and other search engines are the most obvious. The gazillion other "filters" are the other. Libraryland only needs a single controlling site because library catalogs do not usually allow spidering or permanent links. One might go so far as to say the web's switching mechanism is the web. Libraries aren't on the web. Instead, OCLC is acting as a sort of intercessor.

Do a Google link search sometime on links to Library of Congress records. You'll see that every day people make links into the LC catalog from their blogs, syllabi, bookmark managers and so forth. They don't realize all their links will die in a matter of minutes when their LC session expires. Libraries are not part of the web.

2. The whole "eyeballs" idea is out of touch with how the web works. Before the rise of search engines and social filtering the web was a promotional place--lots of dot-coms bought banner ads in 2000. Search engine companies bought television ads! Back then you had to promote yourself just the way you did with old media.

That world is fading now, or gone. Facebook never advertised. Google never advertised. Goodreads, Shelfari and LibraryThing never advertised either. The web has become so efficient at channeling information that good services no longer need to promote themselves.

Libraries don't need a promotional champion. They don't need a "switching station." They need to provide good things--and actually take part in the web!

3. As you write "WorldCat.org, introduced in 2006, is a response to brick-and-mortar libraries' loss of attention in the Web landscape." Well, pardon me for saying so, but if that's libraries' response, they're screwed.

There's a library in practically every town in America and at every college too, and WorldCat aims to represent them. But despite extensive promotion, mostly to librarians, WorldCat is a fringe website. It gets less than 0.7% of Amazon's visitors. Multiple other web bookstores--Abebooks, Alibris--surpass it. A single physical bookstore, Powell's in Portland, Oregon, gets more traffic, as do both LibraryThing and Goodreads--niche sites, as much as I love them. Instead, WorldCat is wrapped in a traffic dance with Dogster, the social network for people who *really* love their dogs. Add Catster and you're toast.

4. WorldCat has a "commitment to comprehensiveness" but the service is optimized for revenue, not eyeballs. True "switching mechanisms" like Google don't try to suck money from content providers. But WorldCat is an expensive service. Many libraries can't pay, so WorldCat pretends to a comprehensiveness it simply doesn't have.

Looking in WorldCat for the "Da Vinci Code" I am led to believe that no public library has the book short of the New Hampshire border. Indeed, I can't spot a single Maine public library on WorldCat, and it doesn't even have the University of Southern Maine. I could grab a book from a dozen public and university libraries within fifteen minutes' drive, and, except for the University of New England, WorldCat doesn't have any of them.

If WorldCat were serious about comprehensiveness it would allow any library anywhere to upload its MARC records for free. Maybe you could charge $25 each. If you could get every library to do that it would be more than enough to run the service.

Karen Calhoun said:

Hello Tim,
So you don't care for OCLC's strategy with WorldCat.org. Many libraries--and the members of the communities they serve--do. As I've said before, OCLC is well aware that .org needs to be better and our team works constantly to improve the switching mechanisms and add libraries.

I don't know where you got your web traffic data but the service I just consulted (Alexa) places WorldCat.org's traffic ranking ahead of LibraryThing's. I wondered about the rest of the rankings in your comment but did not check them.

WorldCat.org is one of five interfaces to the database, not counting the WorldCat API. All the other four interfaces are busy, and none of their traffic gets counted by the web trackers. Neither does the traffic generated by the API. More statistics about WorldCat usage are available at http://www.oclc.org/worldcat/statistics/default.htm and http://www.oclc.org/worldcatalog/overview/statistics/default.htm

Regarding the legal standing of the FAQs, both they and the revised policy have been set aside pending the Review Board's recommendations, so the point is moot at this time. I will make sure your comment about E7 is considered as they review the policy. Karen

Tim Spalding said:

>I don't know where you got your web traffic data but the service I just consulted (Alexa) places WorldCat.org's traffic ranking ahead of LibraryThing's.

I use Compete, which samples ISP content. Quantcast does too (it's showing 738 LT vs. 476k WC). Alexa is based on toolbar hits. Do you have the Alexa toolbar installed? Neither do I. For a while it was very skewed by the fact that it caught on big in Korea. I don't know if that's still true.

I should add that all these stats are half bogus. Alexa is the worst, but they're all subject to problems. LibraryThing and some friends with book sites are all undercounted by 50% in Compete. I'm guessing WorldCat is too--that there's some systematic bias against serious sites. I'm quite willing to believe that WorldCat is doing better than LT. Maybe it's doing a lot better.

The point is that we are of roughly the same order of size. WorldCat should be doing a LOT better—you're the gateway to tens of thousands of libraries. You've got a PageRank of 9. You could be king of the world. Why aren't you?

Ultimately, fifteen years ago when you wanted to find out information about a book you went to a library. Now people go to a bookstore--Amazon. Libraries are exceedingly scarce in search results. Some of that is your fault, but mostly it's OPAC/ILS vendors.

If libraries aren't going to be on the web directly, WorldCat needs to do it for them effectively. And WorldCat just isn't doing what it could here. LibraryThing and OCLC are somewhat at loggerheads, for sure, but I think it would be exceedingly great if ANY library site showed up in the first few results for most books. I think you could do it--with more buy-in from libraries and, frankly, with some search-engine visibility work.

I used to do SEO. Want to hire me? ;)

Roy Tennant said:

Tim, I think it's interesting that you are fascinated by us but then I suppose I shouldn't be surprised. After all, your entire business model is built on the fact that you can use catalog records for free that others created and not contribute anything back unless they pay (yes, there is a limited set of data available via an API, but then they need the chops to do something with it). Good luck with that.

Meanwhile, I'd like to answer a few of your points. I'll leave the traffic one alone since the data is flawed to begin with and it hides the fact that we're all working to get more exposure.

>1. The web already has "switching
>mechanisms."...

We’re not an intercessor, we’re an enabler. There are two basic mechanisms for library collections to be discoverable in search engines: 1) every library opens up their catalog to crawling, and 2) we provide a switching service. The first option is lunacy, since even if it were possible (and given current ILS systems it is not very probable) it would lead to a ridiculous user experience. Even if a searcher could locate a book record by specifying a certain library in the search, there is no mechanism to find other, nearby libraries where someone may also have borrowing privileges. For example, a student at Stanford can likely borrow books at Stanford, the Palo Alto City Library, and members of the Silicon Valley Library System such as Mountain View Public Library. Try qualifying a search for those in Google. Meanwhile, via WorldCat, it is one “Find in a Library” click away.

>Libraries are not part of the web.

It is exactly because of this that we are working so hard to put libraries on the web in usable ways. We’re glad you think this is as important as we do.

>2. The whole "eyeballs" idea is out of
>touch with how the web works. ...

Again, individual libraries are unlikely to be able to "take part on the web" as well as we can on their behalf. We can sit down with Google and get direct links to library collections through us — individual libraries would never be able to do this on their own. Nor should they have to.

>4. WorldCat has a "commitment
>to comprehensiveness" but the service is
>optimized for revenue, not eyeballs. ...

We continuously look for ways to reduce costs to libraries and increase participation in the cooperative. Also, it may not be widely known that small libraries can catalog using WorldCat for $125-$200. Your suggestion that we could support a library in any effective way for $25 a year is quaint. When did you last cost out staff to answer support phone calls and email? But yes, there is more we can do here and we will be pursuing it as we continue to seek ways to reduce costs for our member institutions. Perhaps you can forgive us if we aren’t as nimble as we could be if we weren’t as concerned about costs to members and our need to make sure the investment they have made in this cooperative continues to pay off for many years to come. Unlike many commercial entities that can be “successful” if their principals retire in style, we’re all about service.

About this blog

Metalogue is a forum for sharing thoughts on all things related to knowledge organization by and for libraries, hosted by Karen Calhoun, Vice President, WorldCat and Metadata Services for OCLC. Karen is joined often by friends and colleagues from all over the globe, who contribute perspectives and experiences about the current and future state of cataloguing and metadata.
