June 2010 Archives
OCLC was recently asked to provide an estimate of the number of books held by US libraries that were published outside of the United States. Our answer? Approximately 200 million. We thought readers would be interested in learning the details of how the estimate was obtained.
In MARC records, the 260 field is the most obvious 'place of publication' field. Specifically, the 'a' and 'e' subfields are designed to record "place of publication, distribution, etc." and "place of manufacture", respectively. The problem is that these subfields are filled with names - strings of characters - rather than codes, which are far more reliable and easier to parse.*
Such codes are found in the 008 fixed-length control field. Using those as the focus, we made the following assumptions to answer the question at hand:
- Books are defined as monographic (bib level = 'm') language material (record type = 'a'). This has the effect of including some materials that are not, strictly speaking, "books": pamphlets, broadsides, etc.
- English-language books lacking a known place of publication were assumed to be published in the US; non-English-language books lacking one were assumed to be published outside the US.
- Pre-1923 publications were excluded for purposes of copyright analysis.**
- The sample size was 1,700,000 records (roughly 1% of WorldCat).
- We used holdings data to determine how many copies of each title were owned by US libraries; only the holdings of US libraries counted toward that figure.
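For readers who want to see how rules like these might look in practice, here is a minimal Python sketch. It is not the code we ran; it is an illustration under stated assumptions. The function names `classify` and `make_008` are hypothetical, the positions used are the standard MARC 21 ones (Leader/06 and /07 for record type and bib level, 008/07-10 for the first date, 008/15-17 for the place code, 008/35-37 for the language code), and two rules are assumptions of the sketch rather than anything stated above: a three-character place code ending in 'u' is treated as US, and records whose first date is not four digits are skipped.

```python
from typing import Optional

# Assumption: these stripped 008/15-17 values mean "no known place".
UNKNOWN_PLACE = {"xx", ""}


def classify(leader: str, field_008: str) -> Optional[str]:
    """Return 'us', 'foreign', or None when the record is out of scope."""
    # Books: language material (Leader/06 = 'a'), monographic (Leader/07 = 'm').
    if len(leader) < 8 or leader[6] != "a" or leader[7] != "m":
        return None
    if len(field_008) < 38:
        return None
    # Exclude pre-1923 publications (assumption: non-numeric dates are skipped).
    date1 = field_008[7:11]
    if not date1.isdigit() or int(date1) < 1923:
        return None
    place = field_008[15:18].strip()
    lang = field_008[35:38]
    if place in UNKNOWN_PLACE:
        # Unknown place: English is assumed US, anything else foreign.
        return "us" if lang == "eng" else "foreign"
    # Assumption: US codes are three characters ending in 'u' (e.g. 'nyu', 'xxu').
    return "us" if len(place) == 3 and place.endswith("u") else "foreign"


def make_008(date1: str, place: str, lang: str) -> str:
    """Build a mostly blank 40-character 008 string for demonstration."""
    chars = [" "] * 40
    chars[7:11] = date1
    chars[15:18] = place.ljust(3)
    chars[35:38] = lang
    return "".join(chars)


BOOK_LEADER = "00000nam a2200000 a 4500"
print(classify(BOOK_LEADER, make_008("1985", "nyu", "eng")))
print(classify(BOOK_LEADER, make_008("1985", "fr", "fre")))
print(classify(BOOK_LEADER, make_008("1985", "xx", "ger")))
```

Working from the fixed positions keeps the test to a handful of character comparisons, which is the practical advantage of codes over the free-text names in the 260 field.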
Given these assumptions, we found the following:
Book titles published in the US: 26,710,400
Book titles published outside the US ("Foreign"): 78,017,300
Foreign book titles, held by US libraries: 22,801,900
Copies of foreign book titles ("holdings") worldwide: 461,596,000
US libraries' holdings of foreign book titles: 203,953,200
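A few ratios help put the figures above in context. The short Python sketch below only divides the reported numbers by one another; the variable names are ours, chosen for illustration, and the derived ratios are simple arithmetic rather than additional findings.

```python
# Reported figures from the list above.
titles_us = 26_710_400
titles_foreign = 78_017_300
titles_foreign_held_us = 22_801_900
holdings_foreign_world = 461_596_000
holdings_foreign_us = 203_953_200

# Share of all book titles in the estimate that are foreign.
foreign_share_of_titles = titles_foreign / (titles_us + titles_foreign)

# Share of foreign titles held by at least one US library.
share_held_in_us = titles_foreign_held_us / titles_foreign

# Average number of US library copies per foreign title held in the US.
us_copies_per_title = holdings_foreign_us / titles_foreign_held_us

# US libraries' share of worldwide holdings of foreign titles.
us_share_of_holdings = holdings_foreign_us / holdings_foreign_world

print(f"Foreign share of titles: {foreign_share_of_titles:.1%}")
print(f"Foreign titles held in the US: {share_held_in_us:.1%}")
print(f"US copies per held foreign title: {us_copies_per_title:.1f}")
print(f"US share of foreign-title holdings: {us_share_of_holdings:.1%}")
```

In round terms, US libraries hold about 29% of the foreign titles, at an average of roughly nine copies per title held.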
* In addition, while 260 ‡a (Place of publication) is common, the ‡e (Place of manufacture) appears in fewer than 3% of records, making any combined analysis of these subfields statistically shaky.
** If the pre-1923 cutoff is ignored, the number of foreign book titles increases by roughly 25%.
Jenn Riley of the Indiana University Libraries has released an intriguing graphic entitled "Seeing Standards: a Visualization of the Metadata Universe."
The image not only identifies and classifies 105 standards; it also evaluates each one's "strength of application" along multiple axes. This judgment is based on level of adoption, design intent, and overall appropriateness. Beyond the massive labor that the research and analysis must have taken, the visual presentation is stunning. The work of Devin Becker on the graphic design should be commended.
The graphic is a useful addition to the literature and an excellent way to brush up on some standards in unfamiliar domains. The timing is excellent as well, as the American Library Association Annual Conference this week kicks off a season of acronym-filled meetings.