lrts: Vol. 58 Issue 3: p. 187
Metadata Makeover: Transforming MARC Records Using XSLT
Violeta Ilik, Jessica Storlien, Joseph Olivarez

Violeta Ilik (vilik@library.tamu.edu) is Assistant Professor, Semantic Technologies Librarian, Texas A&M University, College Station, Texas. Jessica Storlien (jessicahennen@gmail.com) is Reference and Instruction Librarian, Blinn College, Brenham, Texas. Joseph Olivarez (jolivare@library.tamu.edu) is a Library Specialist III, Texas A&M University.
The authors would like to express the deepest appreciation and gratitude to Nancy Burford, Veterinary Collectors Curator, for her continuous support and generous help through the process. The authors would also like to express special appreciation and thanks to Anne Highsmith, Director of Consortia Systems, for encouraging our research and for helping us write the XSLT for the first experiment.

Abstract

Catalogers have become fluent in information technology such as web design skills, HyperText Markup Language (HTML), Cascading Style Sheets (CSS), eXtensible Markup Language (XML), and programming languages. The knowledge gained from learning information technology can be used to experiment with methods of transforming one metadata schema into another using various software solutions. This paper will discuss the use of eXtensible Stylesheet Language Transformations (XSLT) for repurposing, editing, and reformatting metadata. Catalogers have the requisite skills for working with any metadata schema, and if they are excluded from metadata work, libraries are wasting a valuable human resource.


Being a cataloger requires more than knowledge and understanding of the Machine-Readable Cataloging (MARC) format, the Anglo-American Cataloguing Rules (AACR2), the Library of Congress Descriptive Cataloging Manual (DCM), the Library of Congress Subject Headings Manual (SHM), Library of Congress Classification (LCC), Resource Description and Access (RDA), and other cataloging rules and standards. Catalogers must also embrace the opportunity to employ new schemas for resource description and learn how to reuse and repurpose existing metadata.

In the current library ecosystem, catalogers must be willing to assume new responsibilities to enable information to be organized, repurposed, and shared with patrons and other libraries. A large part of these new responsibilities is grounded in the importance and use of metadata to meet the needs of libraries, including creating interoperable data, repurposing data, and building digital repositories. Catalogers have the fundamental skills to successfully work with and repurpose metadata. These skills include, but are not limited to, organization of information, knowledge of commonly used access points, and a growing knowledge of information technology. Catalogers must also develop fluency in information technology (IT), including HyperText Markup Language (HTML), Cascading Style Sheets (CSS), eXtensible Markup Language (XML), eXtensible Stylesheet Language Transformations (XSLT), and MARCXML (an XML schema based on MARC21), to expand and reimagine their work. By encouraging catalogers to work with metadata creation and standards and to learn web development skills, libraries are using their resources and staff efficiently. This paper will explain how catalogers with intermediate knowledge of HTML, CSS, and XML can develop stylesheets to transform or enhance XML documents.


Literature Review

A survey conducted in 2007 by Ma investigated how metadata was implemented in Association of Research Libraries (ARL) member libraries and revealed that the metadata qualifications and responsibilities required by most responding institutions included knowledge of MARC cataloging and advanced knowledge of metadata crosswalks, and that knowledge of XML and the Open Archives Initiative (OAI) was considered desirable.1 Park, Lu, and Marion analyzed cataloging position descriptions for vacancies posted on the Autocat discussion list between 2005 and 2006.2 Their results revealed that among the required qualifications, computer skills (including, but not limited to, hardware, software, Microsoft Office applications, word processing, spreadsheets, and Microsoft Windows) appeared in 32.1 percent of the postings. Metadata knowledge, including but not limited to Dublin Core (DC), Encoded Archival Description (EAD), Metadata Object Description Schema (MODS), Text Encoding Initiative (TEI), and Visual Resources Association (VRA), appeared in 23.5 percent of the postings. Web knowledge, including but not limited to the World Wide Web, HTML, Standard Generalized Markup Language (SGML), and XML, appeared in 16.3 percent of the postings. These results reveal that advances in technology have created a new realm of desired skills, qualifications, and responsibilities for catalogers.

Hsieh-Yee asserts that although catalogers may not be involved in writing program code, they need sufficient knowledge of the technologies and tools affecting information organization and services to communicate with vendors and systems management units.3 The current trend is for catalogers to become involved in learning to code through various collaborative venues. Calhoun advocates for librarians to become IT-fluent to better support the future of library information dissemination.4

Reese maintains that as more institutions bring collections online, technical services staff will continue to face the growing issue of distributed metadata retrieval.5 He further states that technical services departments have viewed metadata harvesting and transformation as the responsibility of library technology departments. He discusses Texas A&M University Libraries’ method for metadata repurposing developed by Surratt and Hill in 2004 and notes that data conversion was moved outside the cataloging department, creating a barrier between catalogers and the developers of the script.6 Reese is the creator of MarcEdit (marcedit.reeset.net), a free Windows-based MARC editing tool that can be used in a cataloger’s everyday work and provides a means for editing metadata, XML crosswalking, and metadata harvesting via the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). With this tool, Reese demonstrates that new tools and workflows will continue to be developed and that more technical services departments will turn to metadata harvest and capture as a viable method of generating metadata for digital collections.7

Tosaka stresses the importance of metadata transformation to enable reuse, and he cites the need for additional studies on metadata interoperability and crosswalks.8 He also states that the ability to repurpose metadata is important due to the “increasingly global, interdisciplinary environment where users must deal with metadata records from multiple databases with their individual data structures.”9 Tosaka deduces that collections of malleable, shareable metadata are in demand, and this must be considered by current data-creation standards.10 The workflow used by the University of Illinois Library “implements the eXtensible Markup Language and Open Archives Initiative-Protocol for Metadata Harvesting to create cataloging records for digitized books.”11 The author further makes the point that libraries have to develop these new workflows to be able to keep abreast of the abundance of digitized books produced by mass digitization projects.

Woodley provides these simple definitions for mapping and crosswalks: mapping compares and analyzes two or more metadata schemas while crosswalks are the product of the mapping process.12 St. Pierre and LaPlant define crosswalks as the “specification for mapping one metadata standard to another.”13 Metadata crosswalks are central to ensuring seamless access to information from various systems and essential in converting data from one format to another. Woodley points out the enormity of the tasks involved in repurposing metadata, including transforming records from one schema to another and merging records created using different schemas or standards.14 A similar study, conducted by Rudic and Surla, discusses “the application of the XML technologies for the conversion of the bibliographic records between the different bibliographic formats (YUMARC, UNIMARC and MARC 21)” used in the library system in Serbia.15

Before initiating a metadata crosswalk, it is important to be aware of common issues associated with metadata and crosswalks. According to Dushay and Hillmann, the four categories of metadata problems are missing data, incorrect or erroneous data, confusing or inconsistent data, and insufficient data.16 Woodley states that common issues with migrating data include ambivalent matches, hybrid bibliographic records, data mapping to multiple fields or combining into single fields during migration, orphaned data parsed into incongruous fields, mixed standards in original data, MARC data loss during the migration, and flat versus hierarchical structures.17 St. Pierre and LaPlant describe similar issues with metadata crosswalks, including reconciling metadata organization systems, the choice of unanalogous processes during metadata standards creation, imprecise definitions or alternate naming choices that inhibit element-to-element mapping, information being lost or combined during mapping, and unharmonious hierarchical structures.18 Godby, Young, and Childress note that a problem with crosswalks is that they are not always identified as a standard, and they point to the digital library community’s opposing views on crosswalks.19 One view maintains that crosswalks are a stopgap measure, a local and temporary solution, until a single data standard is developed. The other view asserts that crosswalks “represent an attempt to identify interoperable elements among standards”; this implies that crosswalks should become a standard practice.20

According to Chandler and Westbrooks, construction of open systems and methods that provide the ability to link between different types of metadata is the path to information discovery.21 They also discuss the need for the library technical services departments to be proactive not only in the creation and maintenance of non-MARC metadata, but more importantly, in the development of a means for widely sharing metadata with libraries that require it for resource discovery and access. Ahronheim and Marko cite the need for catalogers to participate in both the development and use of descriptive standards because standards easily allow data to be reused.22 Woodley states, “Consistently recorded, reliable metadata can be reused and combined with metadata records that have been created according to different standards to create richer, more informative information objects.”23 Calhoun concludes that metadata can and should be reused, and libraries must ensure interoperability of their metadata for this very reason.24

The appeal to create interoperable data has led to a discussion of using XML to manipulate data, including MARC data. Johnson ascertained that the XML and HTML standards closely identify with web development, allowing for effective integration with web interfaces.25 XML is inherently hierarchical and web-oriented; it works well with web applications and allows for retrieval and manipulation of data. XML is considered an unencumbered format that facilitates experimentation with library data, for example, transforming MARCXML records into various other schemas. Johnson asserts that “XML presents new opportunities and extended life for MARC” because XML can transfer MARC-encoded information.26 Given the dual roles of librarians as information consumers and metadata producers, it seems natural that librarians would influence and participate in the development of XML software and applications. The authors of this paper propose that catalogers who are empowered by library administrators and encouraged to assume new responsibilities can acquire skills such as programming languages and combine them with existing cataloging expertise to successfully implement metadata projects.


Using XML and XSLT

Catalogers at the Texas A&M University Libraries chose XSLT to manipulate XML records because of its relative ease of use. According to Keith, stylesheets are the ideal format for the maintenance of XML data transformations due to the native adaptability and simplicity of stylesheets.27 No special software is needed to change stylesheets. He further states that because XSLT documents are simple text files, just like XML documents, they can be edited with word-processing software at the most basic level.

An understanding of HTML and CSS is essential to comprehending XML and XSLT. Tennison states that XML is a meta-markup language specifically designed for ease of use with the web, and that it is human-readable and straightforward for applications to read and understand.28 An XML document written in XSLT is commonly known as an XSLT stylesheet and is usually assigned the file extension .xsl.29 Each XSLT stylesheet describes how a set of XML documents (the source documents) should be converted to other documents (the result documents), whether they are eXtensible Stylesheet Language Formatting Objects (XSL-FO), eXtensible HyperText Markup Language (XHTML), comma-delimited text, or any other text-based format, such as HTML. An XSLT stylesheet will typically take source documents written in one markup language and produce a result in another markup language, such as XHTML, that can be used by a specialist application, such as a browser, for presentation.30

To perform a transformation, XSLT must be capable of pointing to information in the source document to process the information and include it in the result. XML Path Language (XPath) serves as the guide for XSLT.31 The most important role of XPath is to collect information from an XML document by navigating through the document. A good XML editor, such as oXygen, is necessary to experiment and work with XML and XSLT. The Library of Congress (LC) developed the MARCXML architecture and MARCXML toolkit to standardize the exchange of MARC structured data in XML. The core of the MARCXML framework is a simple XML schema that contains MARC data. As stated on the LC website, this base schema output can be used where full MARC records are needed or can act as a “bus” to enable MARC data records to go through further transformations, such as to DC or processes like validation.32 Control fields, including the leader, are treated as data strings, while variable fields are treated as elements and the tag and indicators are treated as attributes. Subfields are treated as sub-elements with the subfield code as an attribute.

The MARCXML schema provides easy access to discrete pieces of data, such as data stored in the leader, 008, and subfields, and enables the creation of XML stylesheets to manipulate and transform the data.33 Information is accessed at the subfield level with simple XPath expressions. Many types of transformation exist, such as MARCXML to DC or MODS, as well as simple stylesheets that enhance MARCXML files. Keith states that the MARC21 file format is not easy to modify or transform to other schemas, and it is not common for a software developer to understand MARC.34 A well-written specification is required to manipulate MARC metadata. Unlike MARC, XML is simple, although Coyle notes that it allows for the creation of complicated data records.35 She explains XML by comparing it to MARC tags, such as the use of “245,” which in XML would be written as <title>Title of the book</title>.
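For example, with the MARC21/slim namespace bound to the prefix marc, a single XPath expression reaches the main title (subfield a of the 245 field) of a record:

```xpath
marc:datafield[@tag='245']/marc:subfield[@code='a']
```

The same pattern, varying only the tag and code attribute values, addresses any subfield in the record, which is what makes MARCXML convenient for stylesheet work.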

A major difference between MARC and XML is that XML uses angle brackets to denote beginning tags (<>) and ending tags (</>). The tags must be predefined in a data-format-definition structure. XML is hierarchical, as are MARC tags and subfields. However, XML is potentially more hierarchical because there is no limit to the number of levels, unlike MARC, which is limited by the established standard.

Catalogers at Texas A&M University Libraries used the stylesheets available on LC’s Network Development and MARC Standards Office site for the second experiment. As stated on the LC site, “this framework is intended to be flexible and extensible to allow users to work with MARC data in ways specific to their needs. The framework itself includes many components such as schemas, stylesheets, and software tools.”36 For the purpose of data transformation for sample records from MARCXML to DC, the authors adapted the LC MARCXML to Resource Description Framework (RDF) Encoded Simple DC Stylesheet. The use of this stylesheet produced the DC-RDF file format.

First Experiment: Creation of Bibliographic Records for Electronic Information Resources from Bibliographic Records for the Print Version of the Same Resource

The XSLT for this experiment was created by one of the three authors. As libraries frequently purchase electronic versions of resources owned in print, the authors believed it would save time and money to create MARC records for the electronic resources by reusing the existing metadata stored in the bibliographic records for the print versions. Using a simple XML editor, they created a stylesheet to transform bibliographic records for print resources into bibliographic records for the electronic versions of the same resources in MARCXML file format (see appendix A). The bibliographic records in MARCXML were transformed through the MarcEdit utility into MARC records ready for import into the local Integrated Library System (ILS).

The authors began by identifying fields in the original bibliographic records that would not be reused. There are several fields used in records for print resources that are unnecessary when describing electronic resources. Other fields that need to be removed are unique record identifiers, such as the OCLC number and the ILS bibliographic number. The command line for the removal of those fields is as follows:
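The authors’ full stylesheet appears in appendix A; a minimal sketch of this kind of removal logic (not their exact code) pairs an identity template with empty templates for the fields to be dropped. The tags shown here, the 001 control number and the 035 OCLC number, are illustrative:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:marc="http://www.loc.gov/MARC21/slim">

  <!-- identity template: copy everything not handled by a more specific rule -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- empty templates suppress the matched fields in the output;
       the tags here (001, 035) are illustrative examples -->
  <xsl:template match="marc:controlfield[@tag='001']"/>
  <xsl:template match="marc:datafield[@tag='035']"/>

</xsl:stylesheet>
```

Because the identity template copies everything else unchanged, each additional field to remove costs only one more empty template.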


After deleting the unnecessary print record fields, the mandatory fields needed in MARC records for electronic resources, such as the 006 and 007 fields, were added.
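Assuming a surrounding identity-transform stylesheet with the MARC21/slim namespace bound to the marc prefix, one way to sketch this addition is to emit the new control fields immediately after the copied leader. The fixed-field coded values below are illustrative, not necessarily the authors’ exact strings:

```xml
  <!-- copy the leader, then emit 006/007 for an online resource;
       the coded values shown are illustrative -->
  <xsl:template match="marc:leader">
    <xsl:copy-of select="."/>
    <marc:controlfield tag="006">m        d        </marc:controlfield>
    <marc:controlfield tag="007">cr |||||||||||</marc:controlfield>
  </xsl:template>
```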


A command for changing the 008 form of item position to “o” for online was also created.
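In the books 008, form of item is character position 23 (counting from zero), that is, the twenty-fourth character; XPath’s substring() counts from one. A sketch of such a template, again assuming the marc prefix and an identity transform, might read:

```xml
  <xsl:template match="marc:controlfield[@tag='008']">
    <marc:controlfield tag="008">
      <!-- keep characters 1-23, replace the 24th (form of item)
           with 'o' for online, then keep the remainder -->
      <xsl:value-of
          select="concat(substring(., 1, 23), 'o', substring(., 25))"/>
    </marc:controlfield>
  </xsl:template>
```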


Other fields that are recommended for describing electronic books (e-books) were also added, such as the 300 field with the value of “1 online resource” in subfield a, the 588 field with a value of “Description based on print record,” and the 655 field with a value in subfield a of “Electronic books” and subfield “2” with a value of “local”:
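One plausible way to sketch these additions (not the authors’ exact code) is to re-match marc:record and append the new fields after the copied content; note that this places them at the end of the record rather than in strict tag order:

```xml
  <xsl:template match="marc:record">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
      <marc:datafield tag="300" ind1=" " ind2=" ">
        <marc:subfield code="a">1 online resource</marc:subfield>
      </marc:datafield>
      <marc:datafield tag="588" ind1=" " ind2=" ">
        <marc:subfield code="a">Description based on print record</marc:subfield>
      </marc:datafield>
      <marc:datafield tag="655" ind1=" " ind2="7">
        <marc:subfield code="a">Electronic books</marc:subfield>
        <marc:subfield code="2">local</marc:subfield>
      </marc:datafield>
    </xsl:copy>
  </xsl:template>
```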


After transforming the MARCXML file and creating a new MARCXML file for e-books, the authors processed the file using MarcEdit software to convert it to a MARC file for import into the local ILS or the Online Computer Library Center (OCLC) (see figure 1). The process described above could allow a cataloging department to streamline the creation of thousands of bibliographic records for electronic resources from bibliographic records for print materials the library already holds in the ILS and OCLC.

The transformation made to the bibliographic records for print versions of information resources to bibliographic records for electronic information resources is shown in appendix B. There are two examples of bibliographic records in appendix B: one for a print information resource (record 1) and one for an electronic information resource (record 2). The changes made to the bibliographic records for print resources are made using the stylesheet referred to in appendix A.

The time used by one of the authors in creating the stylesheet for this first experiment was approximately ten hours over a period of three weeks. The initial learning curve needed for the first stylesheet accounts for time consumed in its creation. With each subsequent stylesheet creation, the time required lessened to a little more than thirty minutes. Once the stylesheet was formatted, we were able to create thousands of new electronic resource bibliographic records in seconds via the transformative ability of the process.

Second Experiment: Transforming Sample MARCXML to Dublin Core Records

The next experiment involved the transformation of XML documents from one schema to another using XSLT, specifically the transformation of MARCXML to DC XML. The authors adapted the existing MARCXML to an RDF Encoded Simple DC Stylesheet, available on the LC website. This section of the paper explains the process and workflow for repurposing metadata.

The authors identified record sets for transformation from XML documents using XSLT. For experimental purposes, the authors selected the collection sets themselves, bypassing the consultation with subject selectors or stakeholders that would occur when working with a collection identified for digitization and ingestion into the library’s institutional repository. If a record set from the live catalog were being transformed, the collection would have first been identified by a selector, cleared by the copyright librarian for inclusion in the institutional repository, and then ingested into the repository. As this was an experiment, none of these preparatory steps were necessary.

It is important to note that Texas A&M’s local ILS is a relational database; therefore Microsoft Access was an excellent tool for querying it. After identifying a collection set, an Access report was run to pull data from the local Voyager ILS. Initial queries involved gathering the bibliographic record ID numbers using two tables and limiting by series or call number in a third table. Depending on the collection, one can also run a Binary Large Object (BLOB) query to search for information in any field of a bibliographic record. By querying the data, the authors obtained OCLC numbers from the 035 subfield a in records identified in the initial queries as belonging to a collection set and fed them through the OCLC batch processing utility to obtain the desired MARC files from the OCLC database. The authors worked with OCLC records for this specific collection because those records might have been updated or enhanced since they were first imported into the local ILS, thus potentially providing richer metadata.

After preparing the MARC files, MarcEdit was used to convert the files into MARCXML documents. The first step was to break the downloaded OCLC file into the mnemonic MRK file format using the “MarcBreaker” function (see figure 2). The second step recompiled the edited file into a MARC file in MRC file format with the “MarcMaker” function (see figure 3). The final MarcEdit step was the transformation of the MRC file into XML format with the “MARC=>MARC21XML” function (see figure 4). Once the files were in XML format, the authors experimented with them by applying different stylesheets and transforming them from MARCXML to DC XML files. They used oXygen software for processing and transforming the metadata.

Each collection had specific requirements, and to meet those requirements while preserving and transforming metadata from MARC to DC, the authors tailored stylesheets for each collection. They first created a collection profile to match the local DSpace DC metadata matrix. The next step was the addition of fields that would be the same for every item in a collection. These global fields included the following: dc:format.digitalOrigin, dc:format.medium, dc:type.material, dc:type, and dc:language. According to Hillmann, the recommended best practice for the values of the Language element is defined by RFC 3066, which, in conjunction with ISO 639, defines two- and three-letter primary language tags with optional subtags (see table 1).37 The authors used the following form: en-US.

The command line for adding the language element in the prescribed format is:
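In a MARCXML-to-DC stylesheet, a collection-wide constant can simply be written as a literal element inside the template that builds each output record. A sketch, assuming a record-level template producing DC-RDF output and with the rdf, dc, and marc prefixes declared on the stylesheet root:

```xml
  <xsl:template match="marc:record">
    <rdf:Description>
      <!-- constant for the whole collection; value per RFC 3066 / ISO 639 -->
      <dc:language>en-US</dc:language>
      <xsl:apply-templates select="marc:datafield"/>
    </rdf:Description>
  </xsl:template>
```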


Another example for adding a field that applies to all the records from the collection:
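For instance, the digital-origin field from the collection profile could be emitted the same way, as a literal line inside the record-level template (the element name follows the local DSpace matrix described above):

```xml
      <!-- identical for every record in the collection -->
      <dc:format.digitalOrigin>reformatted digital</dc:format.digitalOrigin>
```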


The authors selected 5XX fields that were initially omitted from the records to avoid duplication and loss of data.

The next step involved the creation of a stylesheet that mapped MARC data to DC data fields. This would be the actual crosswalk from MARCXML to DC XML. For all the trial collections, the 245 MARC title field was mapped to dc:title, while the 100, 110, and 111 fields were mapped to dc:creator. Another example of mapping is in the publisher element, where subfield b of the 260 MARC tag was mapped to dc:publisher. For some collection sets, it made more sense to add a separate field with the name of the institution mapped to the dc:publisher element instead of mapping subfield b from the 260 field. This enabled more uniform and consistent publisher names for all the records, particularly as the authors discovered that the OCLC records were not of consistently good quality and sometimes needed enhancement. This process could only be used when the authors knew that the publisher was the same for all records in a collection. An example of a collection with the same publisher was the Annual Budget Reports for the Texas A&M University System units. In this case, the publisher for the entire collection is the Texas A&M University System. Catalogers also mapped specific subfield data to the DC element that appeared to be the best fit. An example is the 260 subfield c, which was mapped to dc:date.created. The command line for mapping the 260 subfield c to dc:date.created looks like this:
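A minimal template for this mapping (assuming the marc and dc prefixes are declared on the stylesheet root; not the authors’ exact code) might read:

```xml
  <xsl:template match="marc:datafield[@tag='260']">
    <!-- publication date from subfield c becomes the creation date -->
    <dc:date.created>
      <xsl:value-of select="marc:subfield[@code='c']"/>
    </dc:date.created>
  </xsl:template>
```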


If the collection had a series statement, the command for mapping that statement to dc:relation.isPartOfSeries element was added.
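Assuming the series statement is carried in the 490 field (the tag is illustrative; some records carry it in 440 or 830 instead), the mapping template could be sketched as:

```xml
  <xsl:template match="marc:datafield[@tag='490']">
    <dc:relation.isPartOfSeries>
      <xsl:value-of select="marc:subfield[@code='a']"/>
    </dc:relation.isPartOfSeries>
  </xsl:template>
```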


Every collection profile enumerated each specific metadata element necessary to describe the collection, and in essence, it served as a personal guide. Some collections benefited from having the 100 field mapped to dc:creator while the 7XX field was mapped to dc:contributor. In some cases that was not a good choice because the authors determined that individual authors would all be considered creators; in those cases the 100 field and all 7XX fields were mapped to dc:creator.

For some collections, the authors experimented with mapping the MARC 650 subfield z, the geographic subdivision, to dc:coverage:
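A sketch of this mapping matches the subfield directly, so that each geographic subdivision yields its own coverage element:

```xml
  <!-- each 650 subfield z becomes a dc:coverage element -->
  <xsl:template match="marc:datafield[@tag='650']/marc:subfield[@code='z']">
    <dc:coverage>
      <xsl:value-of select="."/>
    </dc:coverage>
  </xsl:template>
```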


This could not be applied to all collection sets, but for some, such as the Annual Budget Reports set, it was possible. Not all parts of every MARC record in a collection were mapped to the full DC schema in this experiment. For example, the linking field 776 (additional physical form) and other 7XX fields were not mapped to the dc:relation element.

The process of transforming MARCXML to DC records initially was a two-step process. The first step enhanced the MARCXML file with additional fields using one stylesheet and the next step transformed the enhanced MARCXML into DC using another stylesheet. Over the course of the project, the authors learned that they could combine these separate steps into one process. They created a new stylesheet that provided both enhancement and transformation of the original MARCXML file (see appendix C). This stylesheet could be modified according to the needs of each collection.

The issue with crosswalking metadata from a schema with high granularity, such as MARC, to a less granular schema, such as DC, is that the loss of bibliographic content and context is unavoidable. Reese notes that metadata of lower granularity cannot easily be moved to schemas with higher granularity because content and context cannot be manufactured if they are not present in the original record.38 For this reason, the authors added fields to the MARCXML file that are equivalent across the whole collection by applying a transformation scenario through XSLT. Examples of added fields are the language field, the MARC 590 field with the value “text” mapped to dc:type.material, and the MARC 591 field with the value “reformatted digital” mapped to dc:format.digitalOrigin. This solution may not be perfect, but it was accomplished easily by using a stylesheet customized for each collection set. Careful analysis of content before the construction of a stylesheet for a collection enabled identification of fields that would be appropriate for all records. The number of additional global fields varied across collections; up to ten were added for some collection sets.

The technical staff from the Digital Initiatives Department confirmed that the DC Resource Description Framework (DC-RDF) format is acceptable for ingestion of records into DSpace, Texas A&M University Libraries’ institutional repository software. Initial experiments described in this paper proved to be successful, leading to inclusion of the Cataloging Department in metadata creation for the institutional repository.

One cataloger adjusted the stylesheets available on the LC’s Network Development and MARC Standards Office site. The customization of the stylesheets for this experiment took five hours. Using this stylesheet, one thousand records were transformed from MARCXML format to DC XML format in less than a minute.


Discussion

Because this work is duplicated within the institution, across the profession, and within libraries at large, using XSLT and similar programming languages, along with tools that automate much of the process, is more than welcome. The time saved by automating these processes cannot be compared with the manual creation of either electronic equivalents of print records or equivalents in a different metadata schema. As demonstrated in this paper, the authors created thousands of records within minutes using stylesheets that required five to ten hours to develop. The creation of a new basic record by cloning an existing record within an ILS can take several minutes, so using stylesheets to create thousands of records can result in substantial savings in staff time. It should be noted that the authors are now reusing and adapting the same stylesheets and therefore do not need to spend time creating new stylesheets each time; simple modifications and adjustments to existing ones take an insignificant amount of time.


Conclusion

Catalogers have the potential to undertake metadata projects by active participation in the transformation of MARCXML file format into DC XML or other metadata file formats. The authors, all catalogers, demonstrated that enhancement to MARCXML is possible with XSLT, and that creation of a MARCXML file format for electronic bibliographic records from print bibliographic records can be easily accomplished using XSLT stylesheets.

The authors also demonstrated that a crosswalk of records to another metadata schema, DC in this case, is simplified with XSLT. Catalogers’ knowledge and understanding of metadata trends and various schemas should be utilized in the transformation and repurposing of existing metadata stored in MARC records. Library managers should encourage catalogers to learn programming languages and how to use free, open-source tools such as MarcEdit. They should also encourage similar projects that could reduce expensive data entry by transforming and reusing existing metadata. Essential skills for these transformations are an awareness of the importance of the use of established standards and of consistency and precision in data entry. Catalogers already possess those skills. With the inclusion of catalogers in metadata projects for institutional repositories, library administrators utilize valuable partners with requisite skill sets.


References
1. Jin Ma, “Metadata in ARL Libraries: A Survey of Metadata Practices,” Journal of Library Metadata 9, no. 1–2 (2009): 1–14.
2. Jung-ran Park, Caimei Lu, and Linda Marion, “Cataloging Professionals in the Digital Environment: A Content Analysis of Job Descriptions,” Journal of the American Society for Information Science & Technology 60, no. 4 (2009): 844–57, dx.doi.org/10.1002/asi.21007.
3. Ingrid Hsieh-Yee, “Educating Cataloging Professionals in a Changing Information Environment,” Journal of Education for Library & Information Science 49, no. 2 (2008): 93–106, accessed May 3, 2013, www.jstor.org/stable/40323778.
4. Karen Calhoun, “Being a Librarian: Metadata and Metadata Specialists in the Twenty-First Century,” Library Hi Tech 25, no. 2 (2007): 174–87, dx.doi.org/10.1108/07378830710754947.
5. Terry Reese, “Automated Metadata Harvesting: Low-Barrier MARC Record Generation from OAI-PMH Repository Stores Using MarcEdit,” Library Resources & Technical Services 53, no. 2 (2009): 121–34, dx.doi.org/10.5860/lrts.53n2.121.
6. Ibid.
7. Ibid.
8. Yuji Tosaka, “Analyzing Library Metadata for Web-Based Metadata Reuse Services: A Case-Study Examination of WorldCat.org and RefWorks,” Journal of Library Metadata 10, no. 4 (2010): 257–75, dx.doi.org/10.1080/19386389.2010.524864.
9. Ibid.
10. Ibid.
11. Myung-Ja Han, “Creating Metadata for Digitized Books: Implementing XML and OAI-PMH in Cataloging Workflow,” Journal of Library Metadata 11, no. 1 (2011): 19–32, dx.doi.org/10.1080/19386389.2011.545001.
12. Mary S. Woodley et al., “Crosswalks, Metadata Harvesting, Federated Searching, Metasearching: Using Metadata to Connect Users and Information,” in Introduction to Metadata, online edition, version 3.0 (Los Angeles: J. Paul Getty Trust, 2008), accessed July 2013, http://scholarworks.csun.edu/bitstream/handle/10211.2/2001/WoodleyMary200803.pdf?sequence=1.
13. Margaret St. Pierre and William P. LaPlant Jr., “Issues in Crosswalking Content Metadata Standards” (white paper, National Information Standards Organization (NISO), Bethesda, MD, 1998), accessed May 3, 2013, www.niso.org/publications/white_papers/crosswalk.
14. Woodley, “Crosswalks.”
15. Gordana Rudic and Dusan Surla, “Conversion of Bibliographic Records to MARC 21 Format,” Electronic Library 27, no. 6 (2009): 950–67, dx.doi.org/10.1108/02640470911004057.
16. Naomi Dushay and Diane I. Hillman, “Analyzing Metadata for Effective Use and Re-use,” in Proceedings of the 2003 International Conference on Dublin Core and Metadata Applications: Supporting Communities of Discourse and Practice—Metadata Research & Applications (n.p.: Dublin Core Metadata Initiative, 2003): 1–10, accessed July 30, 2013, http://dl.acm.org/citation.cfm?id=1383296.1383318.
17. Woodley, “Crosswalks.”
18. St. Pierre and LaPlant, “Issues in Crosswalking Content Metadata Standards.”
19. Carol Jean Godby, Jeffrey A. Young, and Eric Childress, “A Repository of Metadata Crosswalks,” D-Lib Magazine 10, no. 12 (2004), accessed May 3, 2013, www.dlib.org/dlib/december04/godby/12godby.html.
20. Ibid.
21. Adam Chandler and Elaine L. Westbrooks, “Distributing Non-MARC Metadata: The CUGIR Metadata Sharing Project,” Library Collections, Acquisitions & Technical Services 26, no. 3 (2002): 207, dx.doi.org/10.1016/S1464-9055(02)00247-6.
22. Judith Ahronheim and Lynn Marko, “Exploding Out of the MARC Box: Building New Roles for Cataloging Departments,” Cataloging & Classification Quarterly 30, no. 2–3 (2000): 216–25.
23. Woodley, “Crosswalks.”
24. Calhoun, “Being a Librarian,” 178.
25. Bruce Chr. Johnson, “XML and MARC: Which Is ‘Right’?” Cataloging & Classification Quarterly 32, no. 1 (2001): 81–90, dx.doi.org/10.1300/J104v32n01_07.
26. Ibid.
27. Corey Keith, “Using XSLT to Manipulate MARC Metadata,” Library Hi Tech 22, no. 2 (2004): 122–30.
28. Jeni Tennison, Beginning XSLT 2.0: From Novice to Professional (Berkeley, CA: Apress, 2005), 50.
29. Ibid.
30. Ibid.
31. W3C, “XML Path Language (XPath) 2.0 (Second Edition),” accessed April 11, 2014, www.w3.org/TR/xpath.
32. Library of Congress, Network Development and MARC Standards Office, “MARCXML,” accessed July 30, 2013, www.loc.gov/standards/marcxml.
33. Keith, “Using XSLT to Manipulate MARC Metadata.”
34. Ibid.
35. Karen Coyle, “Understanding Metadata and Its Purpose,” Journal of Academic Librarianship 31, no. 2 (2005): 160–63.
36. Library of Congress, Network Development and MARC Standards Office, “MARCXML.”
37. Diane Hillmann, “Using Dublin Core—The Elements,” Dublin Core Metadata Initiative, November 2005, accessed August 3, 2013, http://dublincore.org/documents/usageguide/elements.shtml.
38. Reese, “Automated Metadata Harvesting.”

Figures

Figure 1

MARC21XML to MARC Conversion



Figure 2

MarcBreaker Function



Figure 3

MarcMaker Function



Figure 4

MARC to MARC21XML Function


Tables
Table 1

Sample Metadata Profile


MARC field                                                 Dublin Core field
100, 110, 111                                              dc.creator
245                                                        dc.title
260 subfield c                                             dc.date.created
260 subfield b                                             dc.publisher
650 _4 (example: Major Mathematics)                        dc.description
500                                                        dc.description
520                                                        dc.description.abstract
Added field 546                                            dc.language.iso
600                                                        dc.subject.lcsh
830                                                        dc.relation.isPartOfSeries
Added field 594 with value “Texas A&M University” (some records)  dc.publisher
Added field 590 with value “text”                          dc.type.material
Added field 595 with value “reformatted digital”           dc.format.digitalOrigin
Added field 596 with value “electronic”                    dc.format.medium

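Mappings like those in the profile above can be expressed directly as XSLT templates. The following is a minimal sketch covering only two of the mappings (100/110/111 to dc.creator and 245 to dc.title), not the authors' actual crosswalk stylesheet; the output element names and the use of the simple DC namespace are assumptions for illustration.

```xml
<!-- Sketch: crosswalk two of the profile's MARC-to-DC mappings.
     Output element names and wrapper <record> are illustrative. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:marc="http://www.loc.gov/MARC21/slim"
    xmlns:dc="http://purl.org/dc/elements/1.1/">

  <xsl:template match="marc:record">
    <record>
      <!-- 100, 110, 111 -> dc.creator -->
      <xsl:for-each select="marc:datafield[@tag='100' or @tag='110' or @tag='111']">
        <dc:creator>
          <xsl:value-of select="marc:subfield[@code='a']"/>
        </dc:creator>
      </xsl:for-each>
      <!-- 245 -> dc.title -->
      <xsl:for-each select="marc:datafield[@tag='245']">
        <dc:title>
          <xsl:value-of select="normalize-space(.)"/>
        </dc:title>
      </xsl:for-each>
    </record>
  </xsl:template>
</xsl:stylesheet>
```

The remaining rows of the profile follow the same pattern: a template or for-each selecting the MARC datafield by tag (and subfield code where the profile specifies one), emitting the corresponding DC element.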

Article Categories:
  • Library and Information Science
    • NOTES ON OPERATIONS
