August 2012 Archives

This past spring, Richard Wallis joined OCLC; you can read the news release for all the details. His new job title is "Technology Evangelist," which, he tells me, means that he's supposed to talk as often as possible, in as many venues as possible, about emerging technology that's relevant to OCLC members and the library environment at large.

With that in mind, an interview seemed like a good idea to us both. I got a chance to sit down and chat with Richard when he was here in our Dublin, Ohio office. Part one of our interview is below, and we'll be adding the second part within the next week.


Andy Havens: First off... Welcome to OCLC! I've been reading your stuff for years and listening to your Library 2.0 Gang podcasts and other presentations. So it's great to have you on board.

Richard Wallis: Thanks, Andy. It's great to be here. Everybody's been very friendly and welcoming and I'm having a lot of fun so far.

Andy: Glad to hear it. Today, if it's OK, I'd like to talk about two things. First, why and how you joined OCLC. Then I'd like you to explain linked data to me in a way that makes sense for a "non-technology-evangelist."

Richard: [laughing] Sure, sure. But is it OK if I start with linked data? I think explaining that will then lead pretty naturally into my recent move to OCLC.

Andy: Sounds great.

Richard: OK. So... how much do you know about linked data?

Andy: Let's pretend I know nothing.

Richard: Fair enough. Well, you know something about how the Web works, yes?

Andy: Sure. At least the basics: HTML, links, URLs. That kind of stuff.

Richard: Well then, you know almost everything you need to know about linked data.

Andy: ???

Richard: Seriously. When Sir Tim Berners-Lee created the Web, the idea was to make it as simple as possible to connect documents, which are usually Web pages. To do that you need three things: a unique identifier for each document or Web page (a URL or URI); a common language for marking up and displaying documents (HTML); and a protocol to transmit the contents (HTTP).

Andy: I'm with you so far. So the Web is kind of a "linked documents" environment?

Richard: Yes. Now, the thing about linking documents is that a link only "knows" two things: the page it's on, and the page it's linking to. The ease of that relationship is part of what helped the Web grow so quickly. You don't have to register a Web page or an image or a file in some giant, unified index of Web resources. You just put it out there and let people link to it.

Andy: I get that. But documents aren't data. They're data, well... in context.

Richard: That's right. And linked data on the Web takes linking one step further by providing that context. When you link two documents, you're missing a key piece of information: why?

Andy: What do you mean? If I create a link, I know why I'm doing it.

Richard: You do, but without analyzing the text around it, a user doesn't. And a machine might make a good guess (that's a lot of what search engines do: guess at good Web pages based on what we link to), but it can't know exactly why you did it.

Andy: Can you give me an example?

Richard: Sure. The classic example is one of disambiguation. You link the word "Columbus" to a Web page about Christopher Columbus. Someone else links it to a map of Columbus, Ohio. The search engine, and some users, will have to guess about the meaning. Also, as you said, a page has a lot more context.
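
[Editor's aside: Richard's Columbus example can be sketched in a few lines of Python. This is only an illustration; the `Hyperlink` structure and the example.org addresses are made-up placeholders, not part of any Web standard or of the interview.]

```python
from typing import NamedTuple


class Hyperlink(NamedTuple):
    """A plain Web link: it records where it is and where it points, nothing more."""
    source: str       # page the link appears on
    target: str       # page the link points to
    anchor_text: str  # the visible, clickable words


# Hypothetical pages, for illustration only.
links = [
    Hyperlink("http://example.org/history-blog",
              "http://example.org/christopher-columbus", "Columbus"),
    Hyperlink("http://example.org/travel-blog",
              "http://example.org/map-of-columbus-ohio", "Columbus"),
]

# Both links carry the same anchor text but lead to different things,
# and nothing in the link itself says what the relationship is.
anchor_texts = {link.anchor_text for link in links}
targets = {link.target for link in links}

print(len(anchor_texts), "anchor text,", len(targets), "different targets")
```

A reader or a search engine sees the word "Columbus" twice and has to infer the meaning from the surrounding page, which is the guessing Richard describes.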

Andy: So in linked data a link is a link with context?

Richard: Right. In linked data terminology, we call that a "triple," because the format for the data in RDF (that's Resource Description Framework, a linked data syntax standard) requires three parts. Just like a simple sentence, a triple involves a subject, a predicate and an object. Let me draw you a picture; it might help...


Andy: So in the example you gave...

Richard: You'd say "Columbus," (subject) "is the last name of" (predicate) "the 15th century explorer" (object). That way, in this case, we'd know you're not talking about the city.
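
[Editor's aside: the subject, predicate and object that Richard lists can be written down as a tiny data structure. The sketch below uses plain strings rather than real RDF syntax, the `Triple` and `as_sentence` names are hypothetical, and the second statement about the city is an assumption added for contrast.]

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One linked data statement, shaped like a simple sentence."""
    subject: str    # the thing being described
    predicate: str  # the relationship, i.e. the "why" of the link
    object: str     # what the subject is related to


def as_sentence(triple: Triple) -> str:
    """Read a triple aloud as subject, predicate, object."""
    return f"{triple.subject} {triple.predicate} {triple.object}."


# Richard's example from the interview.
explorer = Triple("Columbus", "is the last name of", "the 15th century explorer")

# An assumed second statement, added here only to show the contrast.
city = Triple("Columbus", "is the name of", "a city in Ohio")

print(as_sentence(explorer))
print(as_sentence(city))
```

The two statements share a subject, but the predicate and object make it explicit which Columbus each one means.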

Andy: Got it. But in your picture, the object can be either some data or another URI?

Richard: That's right... and that's what makes linked data so powerful. It's not just linking any two pieces of data. Our concept of Columbus can be linked not only to a name but also to another concept through, say, a "place of birth" relationship. So, to update our diagram...


Andy: So if someone else provided a really good description of Columbus, my triple could point to that, rather than me making up my own definition.
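
[Editor's aside: here is a minimal sketch of the idea that an object can be either plain data or another URI, and that a triple can point at a description someone else maintains. Every URI below is a hypothetical example.org placeholder, the `URI`, `Literal`, `describe` and `linked_resources` names are invented for illustration, and Genoa is used as the commonly cited birthplace.]

```python
from typing import List, NamedTuple, Union


class URI(str):
    """An identifier that can itself be the subject of other triples."""


class Literal(str):
    """A plain piece of data, such as a name."""


class Triple(NamedTuple):
    subject: URI
    predicate: URI
    object: Union[URI, Literal]


# Hypothetical identifiers; imagine COLUMBUS and GENOA are published by someone else.
COLUMBUS = URI("http://example.org/id/christopher-columbus")
GENOA = URI("http://example.org/id/genoa")
MY_PAGE = URI("http://example.org/my-page")

LAST_NAME = URI("http://example.org/terms/lastName")
PLACE_OF_BIRTH = URI("http://example.org/terms/placeOfBirth")
NAME = URI("http://example.org/terms/name")
ABOUT = URI("http://example.org/terms/about")

triples = [
    Triple(COLUMBUS, LAST_NAME, Literal("Columbus")),  # object is data
    Triple(COLUMBUS, PLACE_OF_BIRTH, GENOA),           # object is another URI
    Triple(GENOA, NAME, Literal("Genoa")),
    Triple(MY_PAGE, ABOUT, COLUMBUS),                  # reuse the existing description
]


def describe(resource: URI) -> List[Triple]:
    """Return every statement whose subject is the given resource."""
    return [t for t in triples if t.subject == resource]


def linked_resources(resource: URI) -> List[URI]:
    """Return the objects of a resource that are URIs and can be followed further."""
    return [t.object for t in describe(resource) if isinstance(t.object, URI)]


# Start at my page and keep following URI objects.
for step in linked_resources(MY_PAGE):
    print(step, "->", linked_resources(step))
```

Because `MY_PAGE` points at the shared identifier instead of restating who Columbus was, anything already said about that identifier, including the link on to his place of birth, comes along for free.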

Richard: That's right. And that's where OCLC comes in, and why I'm glad to be in this place, at this time.

Andy: That was a clever link itself... and I think we'll follow up from there when we continue the interview later.

Richard: Works for me.

Andy: Thanks. Anything else to add before we close?

Richard: Two quick things. First, I will be hosting the Linked Data Roundtable at IFLA in Helsinki. I'd encourage anyone who's going to be at IFLA and who is interested in linked data to register and attend. Also, I believe you've just posted a short YouTube video that explains the basics of linked data and how it's important for libraries. I thought a quick pointer to that might be in order.

Andy: Both good links to include in the interview. Thanks again, Richard, for your time... and we'll see you soon for part two.

Stay tuned for part two of our interview, where Richard shares why he joined OCLC and what linked data had to do with that decision.