How Many "Foreign" Books Are in US Libraries?

The following is a guest post by Ed O'Neill. Ed is a Research Scientist at OCLC.



OCLC was recently asked to provide an estimate of the number of books held by US libraries that were published outside of the United States. Our answer? Approximately 200 million. We thought readers would be interested in learning the details of how the estimate was obtained.

In MARC records, the 260 field is the most obvious 'place of publication' field. Specifically, the 'a' and 'e' subfields are designed to record "place of publication, distribution, etc." and "place of manufacture", respectively. The problem is that these fields are filled with names - strings of characters - rather than codes, which are much more reliable and easy to parse.*

Such codes are found in the 008 field (fixed-length data elements), where character positions 15-17 hold a place-of-publication code. Using those codes as the focus, we made the following assumptions to answer the question at hand:

  • Books are defined as monographic (bib level = 'm') language material (record type = 'a'). This has the effect of including some materials that are not, strictly speaking, "books": pamphlets, broadsides, etc.
  • English-language books lacking a known place of publication were assumed to have been published in the US; non-English-language books lacking one were assumed to have been published outside the US.
  • Pre-1923 publications were excluded for purposes of copyright analysis.**
  • The sample size was 1,700,000 records (roughly 1% of WorldCat).
  • We used holdings data to determine how many copies of each title were owned by US libraries. For this holdings count, only the holdings of US libraries were considered.
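
The assumptions above can be sketched in code. This is a minimal illustration, not the script OCLC ran: it assumes the standard MARC 21 positions (Leader/06 for record type, Leader/07 for bibliographic level, 008/07-10 for the first date, 008/15-17 for place and 008/35-37 for language), assumes that three-character place codes ending in "u" denote the US, and drops records whose date cannot be read. The `SampledRecord` class, the `us_holdings` attribute, the `make_008` helper and the scale factor of 100 (from the 1% sample) are all hypothetical names and choices made for this example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

SCALE_FACTOR = 100   # assumption: a 1% sample is scaled up by 100
CUTOFF_YEAR = 1923   # pre-1923 publications are excluded


@dataclass
class SampledRecord:
    leader: str       # 24-character MARC leader
    field_008: str    # 40-character 008 field (books layout)
    us_holdings: int  # hypothetical: number of US libraries holding the title


def make_008(year: str, place: str, lang: str) -> str:
    """Build a toy 008 with only the positions used below filled in."""
    return ("000000" + "s" + year.ljust(4) + " " * 4
            + place.ljust(3) + " " * 17 + lang.ljust(3) + "  ")


def is_book(rec: SampledRecord) -> bool:
    # Language material (type 'a') at the monographic level ('m').
    return rec.leader[6] == "a" and rec.leader[7] == "m"


def pub_year(rec: SampledRecord) -> Optional[int]:
    date1 = rec.field_008[7:11]
    return int(date1) if date1.isdecimal() else None  # e.g. "19uu" -> None


def in_scope(rec: SampledRecord) -> bool:
    # Assumption of this sketch: records with unreadable dates are dropped.
    year = pub_year(rec)
    return is_book(rec) and year is not None and year >= CUTOFF_YEAR


def is_foreign(rec: SampledRecord) -> bool:
    place = rec.field_008[15:18].strip(" |")
    lang = rec.field_008[35:38]
    if place in ("", "xx"):
        # Unknown place: fall back on language, as in the second assumption.
        return lang != "eng"
    # Assumption: US codes are three characters ending in 'u' (nyu, cau, xxu).
    # Other special codes are not handled here.
    return not (len(place) == 3 and place.endswith("u"))


def estimate_us_holdings_of_foreign_books(sample: Iterable[SampledRecord]) -> int:
    sampled = sum(r.us_holdings for r in sample if in_scope(r) and is_foreign(r))
    return sampled * SCALE_FACTOR


LEADER_BOOK = "00000nam a2200000 a 4500"
sample = [
    SampledRecord(LEADER_BOOK, make_008("1987", "enk", "eng"), 12),  # England
    SampledRecord(LEADER_BOOK, make_008("1987", "nyu", "eng"), 40),  # New York
    SampledRecord(LEADER_BOOK, make_008("1965", "xx", "fre"), 3),    # place unknown, French
    SampledRecord(LEADER_BOOK, make_008("1901", "fr", "fre"), 7),    # before the cutoff
]
print(estimate_us_holdings_of_foreign_books(sample))
```

In this toy sample only the English and the place-unknown French records count as in-scope foreign books, so their US holdings (12 and 3) are summed and scaled up.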

Given these assumptions, we found the following:

Book titles published in the US:                          26,710,400
Book titles published outside the US ("Foreign"):         78,017,300
Foreign book titles, held by US libraries:                22,801,900
Copies of foreign book titles ("holdings") worldwide:    461,596,000
US libraries' holdings of foreign book titles:           203,953,200
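
A few ratios follow directly from these figures and may help put them in context. The short calculation below uses only the numbers reported above; the variable names are made up for this example, and the ratios are derived here rather than reported by OCLC.

```python
# Figures as reported in the post.
us_titles = 26_710_400
foreign_titles = 78_017_300
foreign_titles_held_us = 22_801_900
foreign_holdings_worldwide = 461_596_000
foreign_holdings_us = 203_953_200

# Share of foreign titles that at least one US library holds.
share_titles_held_us = foreign_titles_held_us / foreign_titles

# Share of all copies of foreign titles that sit in US libraries.
share_holdings_us = foreign_holdings_us / foreign_holdings_worldwide

# Average number of US holdings per foreign title held in the US.
avg_us_copies_per_title = foreign_holdings_us / foreign_titles_held_us

print(f"{share_titles_held_us:.1%} of foreign titles are held in the US")
print(f"{share_holdings_us:.1%} of foreign-title holdings are in US libraries")
print(f"{avg_us_copies_per_title:.1f} US holdings per held foreign title")
```

Roughly three in ten foreign titles are held by at least one US library, and those titles average about nine US holdings each. Every reported figure ends in two zeros, which is consistent with sample counts having been multiplied by 100, though the post does not say so explicitly.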

Notes:

* In addition, while 260 ‡a (Place of publication) is common, ‡e (Place of manufacture) appears in fewer than 3% of records, making any combined analysis of these subfields statistically shaky.

** If the pre-1923 cutoff is ignored, the number of foreign book titles increases by roughly 25%.

2 Comments

Bryan said:

You mention "[t]he problem [with 260] is that these fields are filled with names - strings of characters - rather than codes, which are much more reliable and easy to parse."

Assuming we retain this area into the future and assuming the use of MARC for the next 10 yrs, do you think the trend will be eventually to switch to codes for places and publisher names in 260 rather than strings of characters? Surely codes for places and publishers already exist and could be adapted for our use in bibliographic data. Do the publishers have the place and publisher names in coded form already?

In this area, if not others, perhaps the cataloging rule could be "Take/transcribe what you see only in the absence of coded values."

Bob W said:

I'm assuming from the methodology that these figures would include e-books, if the original was published outside the US. (E.g., netLibrary copies of UK publications) Any way to know how the numbers would change if limited to print?

About this blog

Metalogue is a forum for sharing thoughts on all things related to knowledge organization by and for libraries, hosted by Karen Calhoun, Vice President, WorldCat and Metadata Services for OCLC. Karen is joined often by friends and colleagues from all over the globe, who contribute perspectives and experiences about the current and future state of cataloguing and metadata.
