Book Review: Linked Data for the Perplexed Librarian

Judy Gitlin

Linked Data for the Perplexed Librarian. By Scott Carlson, Cory Lampert, Darnelle Melvin, and Anne Washington. Chicago: ALA Editions, 2020. 164p. $59.99 softcover (ISBN 978-0-8389-4746-3)

In the introduction, the authors of this concise work state that their aim is to present the basics of linked data specifically to librarians “whose background may not be traditionally considered ‘technical’” (x). They add that the book “is a great primer on linked data basics, it is not an exhaustive dive into the topic, nor is it intended to make you an expert” (xi). To a large degree, the authors meet these objectives, at least for this reviewer, who does not have a technical background.

The authors recognize that members of the linked data community often have difficulty communicating with less technical library staff. They also acknowledge that while linked data is widely discussed at conferences and in the library literature, it is applied in practice only at larger institutions with bigger budgets, more staff, and stronger institutional support. Meanwhile, other librarians wonder what relevance linked data has to their work.

The authors first define linked data as data that can be read by both humans and machines. They then give a history of linked data, which is really also a history of the internet and, more specifically, the World Wide Web. Next they discuss the “Semantic Web,” which uses linked data. In one of many useful examples in the book, they explain the “Semantic Web” by listing things your brain is likely to know about Charlie Chaplin, such as that he was a person with a movie career, even if you are not an expert in the history of show business. The authors then illustrate how linked data, in the form of the “Semantic Web,” can make connections such as the one between Charlie Chaplin and the studio that produced his movies. That studio also produced movies by his friends Douglas Fairbanks and Mary Pickford, and later produced James Bond movies. This reviewer found examples such as this one, and the accompanying graphics, quite helpful in illustrating how the data connections are made.

The book then turns to a history of Machine-Readable Cataloging (MARC). Not surprisingly, a format designed for a printed environment does not work for linked data, as it cannot create connections among various forms of data. It is also unknown outside of the library world.

In the “About the authors” section, all four authors name their favorite P-Funk album(s). This reviewer (who read that part first) initially thought this was an interesting quirk, but it turned out to be central to their presentation of linked data: the running example is a library that receives a donation of an extensive collection of funk and soul vinyl records. In another useful example, the authors describe the difficulties in using MARC to convey that an album may have both an LP and a CD version. MARC also cannot readily convey that an album is a collection of songs, because the MARC 505 (contents note) and 740 (Added Entry-Uncontrolled Related/Analytical Title) fields have weaknesses. The authors describe Resource Description Framework (RDF) as a linked data means of showing connections among various data points. For example, George Clinton is both a performer and a producer of musical albums.
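For readers who want to see the idea concretely, the following is a minimal sketch, not taken from the book, of how RDF-style statements can record that one person holds two roles on the same album and that the album exists in two formats. It uses plain Python tuples rather than an RDF library, and every URI, property name (`performerOf`, `producerOf`, `hasFormat`), and the album identifier is a hypothetical placeholder chosen for illustration.

```python
# Each statement is a (subject, predicate, object) triple, the basic unit of RDF.
# ASSUMPTION: all URIs and property names below are invented placeholders,
# not identifiers from the book or from any real vocabulary.
EX = "http://example.org/"

CLINTON = EX + "person/george-clinton"
ALBUM = EX + "album/sample-album"
PERFORMER_OF = EX + "vocab/performerOf"
PRODUCER_OF = EX + "vocab/producerOf"
HAS_FORMAT = EX + "vocab/hasFormat"

triples = {
    (CLINTON, PERFORMER_OF, ALBUM),
    (CLINTON, PRODUCER_OF, ALBUM),
    (ALBUM, HAS_FORMAT, "LP"),
    (ALBUM, HAS_FORMAT, "CD"),
}


def roles(person, work):
    """Return every predicate that links the person to the work."""
    return sorted(p for s, p, o in triples if s == person and o == work)


def objects(subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


if __name__ == "__main__":
    print(roles(CLINTON, ALBUM))
    print(objects(ALBUM, HAS_FORMAT))
```

Because each fact is its own statement, adding a second role or a second format means adding one more triple rather than reworking a record, which is the kind of connection the review describes MARC as struggling to express.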

As the book proceeds, it becomes more technical, as the authors warned in the introduction. They describe Uniform Resource Identifiers (URIs) as a way to identify data in a form readable by both computers and humans. Next, they describe the SPARQL query language and several linked data serialization formats, including RDF/XML, N-Triples, Turtle, and JSON-LD. The descriptions include sample data presented in the various formats. The next chapter is devoted to ontologies and linked data.
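To give a sense of what one of these formats looks like, here is a small sketch, again not drawn from the book, that writes triples out in N-Triples, the simplest of the serializations named above: one statement per line, URIs in angle brackets, literal values in double quotes, and a closing period. The helper name `to_ntriples`, the `example.org` URIs, and the rule that any object beginning with `http://` or `https://` is treated as a URI are all simplifying assumptions for illustration; language tags, datatypes, and blank nodes are not handled.

```python
# ASSUMPTION: URIs are hypothetical placeholders; objects that start with
# http:// or https:// are treated as URIs, everything else as a plain literal.
EX = "http://example.org/"


def to_ntriples(triples):
    """Serialize (subject, predicate, object) string tuples as N-Triples lines."""
    lines = []
    for s, p, o in triples:
        if o.startswith(("http://", "https://")):
            obj = f"<{o}>"
        else:
            # Escape backslashes and double quotes inside literal values.
            escaped = o.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


sample = [
    (EX + "album/sample-album", EX + "vocab/title", 'A "Sample" Album'),
    (EX + "person/george-clinton", EX + "vocab/producerOf", EX + "album/sample-album"),
]

if __name__ == "__main__":
    print(to_ntriples(sample))
```

Turtle and JSON-LD express the same statements more compactly or in a JSON-friendly shape; the underlying triples do not change, only the way they are written down.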

Real-world examples are also presented, such as how a Google search might determine whether someone is searching for “Black Panther” the movie or “black panther” the animal. The authors also describe social media and Wikidata. The next few chapters attempt to answer the questions many librarians have: When and how are they going to use linked data? It is already used in libraries, for example in Library of Congress name authority headings. The authors then proceed to the inevitable discussion of the Bibliographic Framework Initiative (BIBFRAME), the data model that is intended to eventually replace MARC. However, BIBFRAME, like linked data, is often discussed but seldom used in the general library world.

The authors wrap up their slim volume with some sample library projects that use linked data: doing inventories of data, exploring tools and technologies, remediating and enhancing metadata, using graphs and RDF, using linked data in communities, experimenting with real-world data, assessing linked data, and finally, and perhaps most importantly, making the case for using linked data. While it was good to see these projects and it would be interesting to explore them further, it still seems unlikely that many linked data projects will be initiated in any but the largest institutions, especially as all kinds of libraries must reinvent their everyday services and activities in light of the 2020 coronavirus pandemic. In the concluding chapter, the authors acknowledge that libraries are not currently using linked data, so an obsolete format remains in use, in part because of technical and staffing issues. The authors succeed in increasing the reader’s understanding of linked data and the benefits of implementing it.

This volume was an engaging introduction to the topic of linked data. The good use of examples conveyed understanding of a technical topic to this non-technical reviewer.—Judy Gitlin (judith.gitln@dc.edu), Dominican College

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
