Book Review: Linked Data for Libraries, Archives, and Museums: How to Clean, Link and Publish Your Metadata

Linked Data for Libraries, Archives, and Museums: How to Clean, Link and Publish Your Metadata. By Seth van Hooland and Ruben Verborgh. Chicago: Neal-Schuman, 2014. 278 p. $85.00 softcover (ISBN 978-0-8389-1251-5).

For the past few years, librarians have heard how Linked Data will be the future of bibliographic data. Linked Data for Libraries, Archives, and Museums: How to Clean, Link and Publish Your Metadata tries to make sense of the hype. The goal of this book is to introduce “the process of making your collections available, from the arduous processes of cleaning and connecting to publishing it for the world” (xiv). Specifically, this book describes metadata standards including Linked Data, associated tools and technologies, and the sustainability of metadata and technologies. The authors critically evaluate various options that can be used to clean, enrich, and publish metadata along with the history, advantages, and disadvantages of each.

Both authors, Seth van Hooland and Ruben Verborgh, are metadata specialists. Seth van Hooland holds a PhD in Information Science, is an assistant professor at the Université libre de Bruxelles (ULB), and is the academic responsible for the Information and Communication Science Department. Ruben Verborgh is a researcher in semantic hypermedia at Ghent University–iMinds, Belgium, and holds a PhD in Computer Science Engineering. Together they created a website, http://freeyourmetadata.org, to provide learning materials on metadata.

This book is well organized and thought out. The authors describe their approach as “pragmatic.” A glossary at the front defines the terms used, and helpful illustrations and tables are included to highlight the concepts presented. The introduction clearly lays out the structure of the book by giving the goals, audience, concepts, and skills of each chapter. Each of the five core chapters ends with a case study that allows practice of the concepts presented in the chapter. The case studies are easily identified in the text by a grey vertical bar in the margin. The authors note that three of the five case studies use OpenRefine software. The sample data used in the case studies is based on real-world data and is available for download at the website, http://book.freeyourmetadata.org. This book takes on the tough task of reaching a diverse audience of library, archives, and museum professionals with varying needs. In order to relate to all, the authors use case studies from a wide range of institutions: Schoenberg Database of Manuscripts, Powerhouse Museum, British Library, and Cooper-Hewitt National Design Museum.

Each successive chapter—“Modelling,” “Cleaning,” “Reconciling,” “Enriching,” and then “Publishing”—builds on the concepts presented. They all begin with a bulleted list of clearly stated learning outcomes and contain a conclusion summing up the concepts presented. The chapters take the reader through the steps needed to make metadata accessible on the web.

“Modelling” provides an overview of four data models: tabular data (flat files), relational model (databases), meta-markup languages (XML, etc.), and Linked Data. Diagrams depicting each model and a summary table listing the advantages and disadvantages of each are included.
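To make the contrast between the first and last of these models concrete, the following is a minimal sketch, not taken from the book, of how one flat-file record might be restated as subject-predicate-object statements in the style of Linked Data. The record, the column names, and the example.org URIs are all hypothetical; a real project would use its own namespace and an established vocabulary.

```python
import csv
import io

# Hypothetical flat-file export of one collection record (tabular model).
TABULAR = "id,title,creator\nobj1,Blue vase,Jane Doe\n"

# Hypothetical base URIs, assumed only for illustration.
BASE = "http://example.org/object/"
PROP = "http://example.org/property/"


def rows_to_triples(text):
    """Turn each non-empty cell into a (subject, predicate, object) statement."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = BASE + row["id"]
        for column, value in row.items():
            # The id column becomes the subject itself, so it is skipped here.
            if column != "id" and value:
                triples.append((subject, PROP + column, value))
    return triples


triples = rows_to_triples(TABULAR)
for triple in triples:
    print(triple)
```

Each row of the table becomes several independent statements about the same subject, which is what lets statements from different sources be combined later.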

“Cleaning” claims that most metadata needs to be cleaned. The focus is on improving data quality and some common data problems. It introduces the term “data profiling”—determining the quality of data and then enhancing the quality. The case study introduces OpenRefine.
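As a rough illustration of what data profiling involves, the sketch below counts empty and distinct values in one field of a small record set. It is an assumption-laden example in plain Python, not the OpenRefine workflow used in the book's case study, and the sample records and the `profile` helper are hypothetical.

```python
from collections import Counter

# Hypothetical sample records with typical quality problems:
# an empty field, stray whitespace, and inconsistent capitalization.
RECORDS = [
    {"title": "Blue vase", "date": "1890"},
    {"title": "blue vase ", "date": ""},
    {"title": "Teapot", "date": "c. 1900"},
]


def profile(records, field):
    """Report how many values are empty and how the rest are distributed."""
    values = [record.get(field, "") for record in records]
    empty = sum(1 for value in values if not value.strip())
    # Trimming and lowercasing before counting reveals near-duplicates.
    counts = Counter(value.strip().lower() for value in values if value.strip())
    return {
        "total": len(values),
        "empty": empty,
        "distinct": len(counts),
        "counts": counts,
    }


title_profile = profile(RECORDS, "title")
date_profile = profile(RECORDS, "date")
print(title_profile)
print(date_profile)
```

A profile like this only measures quality; deciding which variant of a near-duplicate value to keep is the separate enhancement step.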

“Reconciling” discusses controlled vocabularies, in particular the importance of sharing and re-using vocabularies. Included are introductions to full-fledged vocabularies, such as Library of Congress Subject Headings (LCSH), and a lightweight web approach, Simple Knowledge Organization System (SKOS), which is often used with Linked Data.
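The basic idea of reconciling local terms against a shared vocabulary can be sketched as a lookup on normalized labels. This is a simplified illustration under stated assumptions: the two-term vocabulary, its example.org identifiers, and the `reconcile` helper are hypothetical, and real reconciliation services typically use fuzzier matching than exact comparison.

```python
# Hypothetical controlled vocabulary: preferred label -> identifier.
VOCABULARY = {
    "Ceramics": "http://example.org/vocab/ceramics",
    "Glassware": "http://example.org/vocab/glassware",
}


def normalize(term):
    """Lowercase a term and collapse runs of whitespace."""
    return " ".join(term.lower().split())


# Index the vocabulary by normalized label for exact-match lookup.
INDEX = {normalize(label): uri for label, uri in VOCABULARY.items()}


def reconcile(terms):
    """Map each local term to a vocabulary identifier, or None if unmatched."""
    return {term: INDEX.get(normalize(term)) for term in terms}


matches = reconcile(["ceramics ", "GLASSWARE", "Pottery"])
for term, uri in matches.items():
    print(repr(term), "->", uri)
```

Terms left unmatched, such as "Pottery" here, are the ones that would need manual review or a richer vocabulary.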

“Enriching” details various ways of augmenting unstructured metadata. The main focus is on applying Named-Entity Recognition (NER) to metadata as a way to identify and disambiguate unstructured data.

“Publishing” describes how to publish your metadata in a sustainable way. In particular, it contrasts Application Programming Interface (API) technology with Representational State Transfer (REST). APIs are characterized as unique to each system, while REST is characterized as a re-usable global standard.

The conclusion discusses the global impact of Linked Data. However, the book lacks a summary chapter that ties together the concepts presented. Additionally, the discussion on Resource Description Framework (RDF) includes the statement, “We purposely do not provide examples here, as there are many on the web” (210). Yes, there are many examples on the web, but many of them are bad. This book should provide more “good” examples of publishing Linked Data and its associated syntax in the “Publishing” chapter.

In the introduction, the authors declare that this is the first handbook designed specifically for library, archives, and museum professionals and that it does not require a computer background. Librarians may remember Karen Coyle’s Library Technology Reports publication, Linked Data Tools: Connecting on the Web, which, though not a handbook, provides an overview of Linked Data geared towards librarians.1 Even though the authors say that the book is aimed at a non-technical audience, knowledge of structured data and query languages is necessary to follow some of the discussions and case studies. In general, though, the concepts are presented in a clear and concise manner. Also, for librarians, the detailed explanation of classification, subject headings, and thesauri in the chapter titled “Reconciling” may be unnecessary, but it would be useful for students and others.

After reading this book, readers may feel that there is a lot more to Linked Data than they thought. The authors organize the information in a logical way to help sort out the various options. They also emphasize the importance of cleaning up data before publishing. Because of the volume of valuable information about metadata in this book, Linked Data for Libraries, Archives, and Museums: How to Clean, Link and Publish Your Metadata would be appropriate for those wanting to learn more about Linked Data and metadata. Working through the examples on the authors’ website would reinforce the concepts presented.

Lisa Romano (Lisa.Romano@umb.edu), University of Massachusetts Boston

Reference

  1. Karen Coyle, Linked Data Tools: Connecting on the Web (Chicago: American Library Association, 2012).

© 2024 Core