An Integrated Approach to Metadata Interoperability: Construction of a Conceptual Structure between MARC and FRBR
Seungmin Lee, Elin K. Jacob
Seungmin Lee received his doctorate in library and information science from Indiana University, Bloomington, Indiana; seungmin@umail.iu.edu
Elin K. Jacob is Associate Professor of Library and Information Science, and Director, School of Library and Information Science Doctoral Program, Indiana University, Bloomington, Indiana; ejacob@indian.edu
This research was presented in a poster session at the American Library Association Annual Conference, Washington, D.C., June 21–27, 2007. The title was “Construction of a Conceptual Structure as a Mediator between MARC and FRBR.”
Abstract
Machine-Readable Cataloging (MARC) is currently the most broadly used bibliographic standard for encoding and exchanging bibliographic data. However, MARC may not fully support representation of the dynamic nature and semantics of digital resources because of its rigid and single-layered linear structure. The Functional Requirements for Bibliographic Records (FRBR) model, which is designed to overcome the problems of MARC, does not provide sufficient data elements and adopts a predetermined hierarchy. A flexible structure for bibliographic data with detailed data elements is needed. Integrating MARC format with the hierarchical structure of FRBR is one approach to meet this need. The purpose of this research is to propose an approach that can facilitate interoperability between MARC and FRBR by providing a conceptual structure that can function as a mediator between MARC data elements and FRBR attributes.
With the increase of information resources in diverse media, the library community has relied on tools such as Machine-Readable Cataloging (MARC) to manage and organize these resources. MARC was originally developed as a machine-readable structure for encoding and exchanging bibliographic data that would provide a basis for cooperative cataloging systems and allow libraries to organize and store bibliographic data in a consistent and standardized way.1 MARC has been the accepted bibliographic standard for more than forty years, and many other institutions that work with information resources (e.g., archives and museums) also have adopted MARC for managing their resources.
Although MARC is suitable for more traditional resources, such as books and printed materials, it may not be the most appropriate tool for describing new forms of resources, such as computer files or web resources that are accessed remotely. Because of its inherently rigid structure, MARC is limited in its ability to describe digital resources and cannot adequately represent complex semantics. MARC's ability to represent relationships between bibliographic entities with multilayered characteristics also is problematic because of its linearity and its flat, single-layered structure.
To overcome these problems and to cope with the complexity of new types of resources, the International Federation of Library Associations and Institutions (IFLA) proposed the Functional Requirements for Bibliographic Records (FRBR) model in 1998.2 FRBR is intended to provide a framework for restructuring catalog databases to reflect the conceptual nature of resources. It uses an entity-relationship model instead of the flat record model underlying current cataloging standards and it focuses on the organization of attributes (or data elements) to provide for multiple relationships between bibliographic entities.
Although FRBR is intended to support representation of the multilayered characteristics of resources, it does have several weaknesses as a bibliographic standard. For example, it does not provide sufficient attributes to fully describe bibliographic entities. Because its strict hierarchical structure prescribes explicit relationships between groups of entities and attributes, the structure of FRBR might be too rigid to support the flexibility necessary for describing dynamic resources.
The weaknesses of both of these approaches have presented serious obstacles for managing resources and have underscored the need for a more flexible structure for bibliographic data that can handle the increasing variety of resource media, represent the relationships between bibliographic entities, and describe the multilayered characteristics of digital resources. If the MARC format could be implemented in a hierarchical structure capable of describing multilayered characteristics, it could address current needs and thereby improve the efficiency of resource retrieval. Integrating MARC with the conceptual hierarchy of FRBR offers one possible approach to the dilemma of representing diverse digital resources.
This paper presents an alternative approach to bibliographic metadata that would facilitate interoperability between MARC and FRBR by using a conceptual structure that functions as a mediator between MARC elements and FRBR attributes. This conceptual structure is not intended to describe resources. Rather, it provides a set of core bibliographic elements that can connect MARC elements to related FRBR attributes and vice versa. By applying the conceptual structure described here, the ability of MARC to fully represent bibliographic entities and of FRBR to utilize relationships between those entities could be maximized, thereby minimizing the weaknesses of both MARC and FRBR. In addition, subsets of the elements in this conceptual structure could be used to describe resources in specific collections.
The library community has made various efforts to address the problem of using MARC and FRBR together and adopted several approaches to interoperability between these different structures. Delsey mapped MARC21 elements to FRBR attributes to achieve interoperability.3 Aalberg refined the FRBR attributes and created mapping tables to match FRBR attributes to MARC elements.4 Both focused on similarities between the two sets of descriptive elements. However, such efforts have indicated that many FRBR attributes cannot be mapped directly to MARC and vice versa. More importantly, both Delsey and Aalberg attempted to map between MARC elements and FRBR attributes without considering either the structural differences in their architectures or the procedural differences that exist between the two systems. MARC's single-layered structure and FRBR's multilayered hierarchy affect the semantics of their components differently: a component in either scheme might have a specific meaning that influences or is influenced by the relational structure of the standard.
The weaknesses of approaches such as those of Delsey and Aalberg are rooted in the mapping method they have used. According to Kurth, Ruddy, and Rupp, mapping is the process of establishing relationships between semantically equivalent elements in different structures.5 Thus mapping refers to the process of associating elements from one set with elements from another set, thereby providing conceptual connections between data elements in two or more bibliographic structures. However, establishment of relationships at the structural level is necessary to achieve full interoperability between two or more schemes. Simple mapping is not sufficient because it identifies relationships only at the level of the data element.
Riva adopted a detailed bidirectional mapping between FRBR relationships and MARC linking entry fields.6 In this mapping, Riva used Tillett's taxonomy of bibliographic relationships, which has seven major classes: equivalence, derivative, descriptive, whole-part, accompanying, sequential, and shared characteristic. Through these divisions, Riva mapped each MARC linking entry field to one or more entries in the FRBR relationship. The mapping focused on differences in scope and level of detail represented in the categorizations of bibliographic relationships, and it provided clear relationships between MARC21 fields and FRBR relationships. Although Riva's mapping allowed direct links between closely related works, the scope was limited to MARC 76X-78X fields and did not provide comprehensive mapping because of differences in granularity between MARC21 and FRBR.
Monch and Aalberg proposed a prototype system for the automatic extraction of FRBR model entities from MARC records.7 This extraction required mapping between MARC fields and FRBR entities. For the mapping, they used the framework BIBSYS, which is based on the MARC format, to convert MARC fields and subfields to FRBR entities given the information stored in a set of existing MARC records. In mapping MARC fields to FRBR entities, Monch and Aalberg introduced what they called an attribute layer to extract consistent information from the records. Using the attribute layer, information from MARC records could be mapped to a set of selected FRBR entities. Although this approach integrated MARC fields and FRBR entities based on their generic meanings, it required a new framework to merge one scheme with another. Even though this approach eliminated heterogeneity between MARC and FRBR through construction of a new framework, it could not retain the entire structure or characteristics of either MARC or FRBR.
Each of these approaches sought to eliminate heterogeneity between MARC and FRBR. However, the weaknesses of these approaches are obstacles to achieving full interoperability. One main reason for their failures is that these approaches focus on the similarity of names to establish relationships between elements in the two structures, frequently resulting in the identification of ambiguous, misleading, or inaccurate relationships that present significant obstacles to achieving interoperability between MARC and FRBR.
Because MARC and FRBR are bibliographic systems with very different structures, the authors sought to achieve interoperability between them and to fully capitalize on the advantages of each by constructing a conceptual framework that would be flexible enough to cope with the complexity of digital resources while functioning as a mediator between the two systems. To construct this conceptual framework, the authors analyzed MARC elements and FRBR entities and attributes in light of their respective structures. They then categorized elements and attributes according to their intended application or use. Only those elements that described resources (MARC) or indicated bibliographic relationships (FRBR) were analyzed, and elements that had the same or similar meanings in each structure were categorized. This provided a basis for mapping elements from each structure; MARC elements in one category were mapped only with FRBR attributes and entities in the equivalent category.
While categorizing elements, the authors identified four types of matching and used them to group elements: exact matching, analogous matching, partial matching, and nonmatching. On the basis of analysis of semantic relationships between categories, the authors identified four levels of nesting for the proposed structure: main class, class, subclass, and instance. Finally, the authors merged the mapped elements to form a new structure that was based on the semantic relationships between elements. This approach capitalizes on the strengths of mapping and merging to achieve interoperability between two heterogeneous structures.
MARC21 Concise Format for Bibliographic Data defines three main components of a MARC record: the leader, the directory, and the variable fields.8 These bibliographic components are enumerated in a predetermined structure that stipulates a prescribed ordering of elements. MARC format uses a set of well-defined tags, indicators, delimiters, and subfield codes in association with its prescribed ordering to describe and store bibliographic data (see table 1).
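The three components can be illustrated with a minimal sketch. It assumes the ISO 2709 exchange layout on which MARC 21 is based: a 24-character leader, 12-character directory entries (tag, field length, starting position), and variable fields closed by a field terminator. The helper names `build_record`, `parse_record`, and `subfields` are hypothetical, the sample data is invented, and the sketch handles ASCII text only, so character counts stand in for byte counts.

```python
FT, RT, SF = "\x1e", "\x1d", "\x1f"  # field terminator, record terminator, subfield delimiter

def build_record(fields):
    """Assemble a toy MARC-style record from (tag, data) pairs (ASCII only)."""
    directory, body, pos = "", "", 0
    for tag, data in fields:
        chunk = data + FT
        # Directory entry: tag (3 chars) + field length (4) + starting position (5).
        directory += f"{tag}{len(chunk):04d}{pos:05d}"
        body += chunk
        pos += len(chunk)
    base = 24 + len(directory) + 1      # leader + directory + directory terminator
    total = base + len(body) + 1        # plus the record terminator
    # Leader: record length (0-4), base address of data (12-16); other positions are sample values.
    leader = f"{total:05d}nam a22{base:05d} a 4500"
    return leader + directory + FT + body + RT

def parse_record(raw):
    """Split a record into its leader and a list of (tag, data) variable fields."""
    leader = raw[:24]
    base = int(leader[12:17])
    directory = raw[24:base - 1]        # exclude the directory's field terminator
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:12])
        data = raw[base + start: base + start + length - 1]  # drop the field terminator
        fields.append((tag, data))
    return leader, fields

def subfields(data):
    """Split a variable data field into its indicators and (code, value) subfields."""
    indicators, *parts = data.split(SF)
    return indicators, [(p[0], p[1:]) for p in parts]

record = build_record([
    ("100", "1 " + SF + "aDoe, Jane."),
    ("245", "10" + SF + "aExample title /" + SF + "cJane Doe."),
])
leader, fields = parse_record(record)
```

Every field is reached only through its position in the directory, which reflects the flat, prescribed ordering of elements described above.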
MARC is an analytical system with a linear structure that can fully describe bibliographic entities through the application of almost 2,000 descriptive elements. However, MARC simply enumerates these elements in a flat, single-layered format prescribed by its linear structure. Unfortunately, this structural rigidity cannot fully support the representation of resources that have multilayered bibliographic relationships. Furthermore, description of new types of digital resources is often problematic because MARC was originally developed for traditional print materials.
One limitation inherent in MARC can be traced to its original purpose as a means for storing bibliographic information in a standard format that could be manipulated by machines. In line with the limited access points offered by traditional card catalogs, MARC adopted the concept of main entry represented by the 1XX field tag. However, in an automated environment, all access points are equal except where they are privileged by automated indexing. Furthermore, retaining the use of main entry access separates related or similar elements and results in the potential duplication of information in a record. For example, the author responsible for a work is designated as the main entry and represented in a 1XX field (e.g., 100 $a personal name). However, the same information about the author is also represented in the statement of responsibility (e.g., 245 $c statement of responsibility). In the case of multiple authors, the first author (100 $a) and other authors (700 $a) are separated in the structure, even though all authors share responsibility for the work. This generates a complex structure that separates closely related data and mandates the duplication of information.
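The separation and duplication described here can be sketched with a toy flat record. This is only an illustration: the record is simplified to a list of (tag, subfield code, value) tuples, and the name `author_access_points` is hypothetical.

```python
# A simplified flat record: (tag, subfield code, value).
record = [
    ("100", "a", "Lee, Seungmin"),
    ("245", "a", "An integrated approach to metadata interoperability"),
    ("245", "c", "Seungmin Lee, Elin K. Jacob"),
    ("700", "a", "Jacob, Elin K."),
]

# Elements that all refer to the parties responsible for the work.
AUTHOR_ELEMENTS = {("100", "a"), ("245", "c"), ("700", "a")}

def author_access_points(rec):
    """Collect every element that carries author information, wherever it sits."""
    return [(tag, code, value) for tag, code, value in rec
            if (tag, code) in AUTHOR_ELEMENTS]

scattered = author_access_points(record)
# The first author's surname appears both in the main entry and in the statement of responsibility.
duplicated = [tag for tag, _, value in scattered if "Lee" in value]
```

In this sketch the author information is spread over three fields, and the first author is recorded twice.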
In contrast, FRBR defines logical and semantic relationships between bibliographic objects in terms of an entity-relationship model: where MARC uses a linear and flat bibliographic structure, FRBR represents a catalog record as a set of relationships between multiple entities. FRBR identifies three types of entities that are relevant to bibliographic objects (see table 2): Group 1 (Work, Expression, Manifestation, and Item); Group 2 (Person and Corporate body); and Group 3 (Concept, Object, Event, and Place). These groups are defined in Functional Requirements for Bibliographic Records: Final Report:
Group 1 comprises the products of intellectual or artistic endeavour that are named or described in bibliographic records: work, expression, manifestation, and item. [Group 2] comprises those entities responsible for the intellectual or artistic content, the physical production and dissemination, or the custodianship of such products: person and corporate body. [Group 3] comprises an additional set of entities that serve as the subjects of intellectual or artistic endeavour: concept, object, event, and place.9
The FRBR model focuses on organizing data about entities. It provides for multiple relationships between bibliographic entities by adopting a hierarchical structure that can describe bibliographic relationships and handle the multilayered characteristics of resources. FRBR defines ninety-seven attributes in terms of the characteristics of an entity rather than as specific attributes. The description of a FRBR entity can be expanded by using attributes that allow users to formulate queries about that particular entity. In contrast, MARC primarily represents manifestation-level and item-level information about bibliographic entities. While MARC does include work-level and expression-level elements that correspond to FRBR attributes, this information is typically placed in fields related to authority records or uniform titles. Thus, although MARC can combine work, expression, manifestation, and item information in a single bibliographic record, it cannot express explicit relationships between these entities.
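The contrast with a flat record can be sketched as follows. This is a minimal illustration, not the FRBR model itself: each Group 1 entity is reduced to one attribute plus a link to the next level, and the class layout, the field names, and `items_of` are assumptions made for the example.

```python
from dataclasses import dataclass, field

# The three groups of entities named in the FRBR final report.
FRBR_GROUPS = {
    1: ["Work", "Expression", "Manifestation", "Item"],
    2: ["Person", "Corporate body"],
    3: ["Concept", "Object", "Event", "Place"],
}

@dataclass
class Item:
    identifier: str

@dataclass
class Manifestation:
    title: str
    items: list = field(default_factory=list)

@dataclass
class Expression:
    title: str
    manifestations: list = field(default_factory=list)

@dataclass
class Work:
    title: str
    expressions: list = field(default_factory=list)

def items_of(work):
    """Follow the explicit links from a work down to its individual items."""
    return [item.identifier
            for expression in work.expressions
            for manifestation in expression.manifestations
            for item in manifestation.items]

# Invented sample data.
work = Work("Example work", [
    Expression("Example work (English text)", [
        Manifestation("Example work (2009 print edition)",
                      [Item("copy-1"), Item("copy-2")]),
    ]),
])
```

Here the relationships between levels are stored explicitly, which is what a single flat record cannot express.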
FRBR enhances the retrieval of digital resources because it contains attributes that are specific to digital resources, such as system requirements, file characteristics, mode of access, and access address, information that MARC does not clearly provide. However, even though the FRBR model can support representation of hierarchical relationships, the fact that FRBR's hierarchy is predetermined means that the relationships between attributes are rigid and do not offer the flexibility necessary to describe the dynamic nature of digital resources.
Traditionally, cataloging consists of three core elements: author, title, and subject. Although evolution of the cataloging environment has added additional elements (e.g., physical description), the core elements are critical for describing resources. Theoretically, every resource has an author who is responsible for the intellectual content of the resource, a title that represents the resource, and a subject or topic that the resource addresses. In addition, the content is carried by a specifiable medium, be it the printed page or a digital website. A bibliographic record is not “complete” without the author, title, subject, and description elements. Accordingly, the authors began construction of the conceptual framework by extracting core bibliographic elements from both MARC and FRBR. After extraction, they categorized these elements according to their representational use to provide a basis for linking between MARC elements and FRBR entities and attributes.
The authors categorized MARC elements according to the referent of each subfield. Because actual definitions of the content of variable data fields were not considered, the authors were able to group elements that have the same or similar semantics but are frequently separated in the MARC format. By categorizing each element on the basis of its referent(s) or semantics, duplication in the MARC format could be identified and eliminated. The authors then grouped MARC elements into seven categories: Author, Title, Subject, Publication, Description, Identifier, and Format (table 3). Elements in Format were further subdivided by specific types of resources (table 4) because the elements describing any one format are often quite different from those of other formats, and identifying similarity across subfields can be difficult, if not impossible.
In similar fashion, the authors categorized FRBR entities and attributes according to their referent(s) without considering predetermined FRBR groupings. As with the MARC elements, the authors grouped FRBR entities and attributes into seven categories: Author, Title, Subject, Description, Identifier, Publication, and Format (table 5). They also subdivided Format by specific types of resources (table 6).
Previous efforts to identify relationships between MARC elements and FRBR entities and attributes usually applied a one-to-one mapping process. One-to-one mapping cannot accurately capture relationships between MARC and FRBR because it focuses on superficial semantics and functionalities and fails to consider similarities or differences in structure across schemes. In contrast, the mapping strategy applied here addresses the weaknesses of previous efforts by considering structural similarities among related elements in each system. The degree or strength of matching between any MARC element and FRBR attribute is an important indicator of shared semantic content and provides a basis for identifying core classes in the proposed conceptual structure. By comparing elements and attributes in corresponding categories, the authors established detailed relationships between MARC elements and FRBR attributes on the basis of an analysis of the strength of matching.
The seven categories contain elements that are either identical or similar in both structures. The authors mapped between MARC and FRBR by comparing the elements in each of the common categories (e.g., elements in the MARC category Author were compared with attributes in the FRBR category Author) to identify the strength of the match between any two elements. In this process of comparison, four levels of matching were applied: exact matching, analogous matching, partial matching, and nonmatching.
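The category-constrained comparison can be sketched as below. The sketch uses only a few elements named in this paper, not the full category tables, and `candidate_pairs` is a hypothetical helper. It does not judge the strength of a match; it only shows that pairs are formed within a shared category and never across categories.

```python
from itertools import product

# A small excerpt of each category; the full lists are in the paper's tables.
MARC_BY_CATEGORY = {
    "Author": ["100 $a Personal name", "700 $a Personal name"],
    "Title": ["245 $a Title statement"],
    "Identifier": ["020", "022"],
}
FRBR_BY_CATEGORY = {
    "Author": ["name of person"],
    "Title": ["title of work", "title of expression", "title of manifestation"],
    "Identifier": ["manifestation identifier"],
}

def candidate_pairs(marc, frbr):
    """Pair MARC elements with FRBR attributes only inside a common category."""
    pairs = []
    for category in sorted(marc.keys() & frbr.keys()):
        for element, attribute in product(marc[category], frbr[category]):
            pairs.append((category, element, attribute))
    return pairs

pairs = candidate_pairs(MARC_BY_CATEGORY, FRBR_BY_CATEGORY)
```

Each resulting pair would then be assessed for its level of matching.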
“Exact matching” indicates that a MARC element and an FRBR attribute have the same referent. For example, “100 $a Personal name” in MARC and the attribute “name of person” in the FRBR entity “Person” have the same referent (i.e., the person or people responsible for creation or production of a resource). The “700 $a Personal name” in MARC also shares the same meaning with “name of person” in the FRBR entity “Person.” MARC's “245 $a Title statement” demonstrates exact matching with the attribute “title of work” in the FRBR entity “Work.” The attributes “title of expression” and “title of manifestation” in FRBR also can be matched with “245 $a Title statement” in MARC because each of these attributes indicates the title of a resource.
“Analogous matching” indicates that an element and an attribute have similar, but not necessarily the same, referents. For example, the 050, 080, and 082 fields in MARC indicate a classification number assigned to a certain work (050 $a LCC, 080 $a UDC, and 082 $a DDC classification number). The FRBR attribute manifestation identifier can include a URL or Digital Object Identifier (DOI) as well as the classification number of a resource. Obviously, the use of MARC fields to indicate the classification label of a resource is similar to but not necessarily an exact match with the use of manifestation identifier in FRBR. Thus, even though the semantic referent of the MARC elements 050, 080, and 082 is more specific than FRBR's manifestation identifier, they are analogous because each is similar to (or a kind of) manifestation identifier.
“Partial matching” indicates that the element(s) and attribute(s) share a referent in part but are not analogous in the whole. For example, “651 $a Geographic name” in MARC represents the name of a specific place or region, but no equivalent attribute is present in FRBR. However, the FRBR entity “Concept” includes the attribute term for the concept, which can be used to represent a place or region as a keyword or topic. The MARC element and the FRBR attribute are not analogous, but they might point to a similar referent and thus indicate a partial matching. Another example of partial matching is the “254 $a Musical presentation statement” in MARC and the attributes “medium of performance,” “numeric designation,” and “key” in the “Work” entity in FRBR. MARC uses the 254 field to represent a musical work, but the range of the field is quite broad and can include any statement related to musical works. Because the attributes “medium of performance,” “numeric designation,” and “key” in FRBR can be parts of the 254 field in MARC, they provide another example of partial matching.
“Nonmatching” indicates that elements and attributes do not share a meaning, even though they might be placed in the same categories. For example, FRBR provides the attribute “colour” in the entity “Manifestation,” but no corresponding field or subfield is present in MARC. Other examples include FRBR attributes used to describe electronic resources, sound recordings, and microforms, such as polarity, playing speed, and kind of sound in the entity “Manifestation.” No element in MARC is designated for representation of these properties, even though a 5XX field might contain similar or equivalent information.
Analysis of the strength of matching between MARC elements and FRBR attributes revealed that the core categories Author, Title, Subject, Description, Identifier, Publication, and Format contained exact, analogous, and partial matching elements from both systems. Only those elements identified with core categories in both MARC and FRBR were considered in subsequent mappings; nonmatching elements were excluded from further consideration. Appendix A shows the mapping for these seven categories.
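The four levels of matching and the exclusion of nonmatching elements can be sketched as follows. The `Match` enumeration, the tuple layout, and the function `retained` are hypothetical; the sample pairs are taken from the examples given above and are not the complete mapping in appendix A.

```python
from enum import Enum

class Match(Enum):
    EXACT = "exact"
    ANALOGOUS = "analogous"
    PARTIAL = "partial"
    NONMATCHING = "nonmatching"

# (MARC element, FRBR attribute, strength of matching); None marks a missing counterpart.
MAPPINGS = [
    ("100 $a", "name of person", Match.EXACT),
    ("245 $a", "title of work", Match.EXACT),
    ("082 $a", "manifestation identifier", Match.ANALOGOUS),
    ("651 $a", "term for the concept", Match.PARTIAL),
    (None, "colour", Match.NONMATCHING),
]

def retained(mappings):
    """Keep exact, analogous, and partial matches; drop nonmatching pairs."""
    return [m for m in mappings if m[2] is not Match.NONMATCHING]

kept = retained(MAPPINGS)
```

Only the retained pairs would be carried forward into the category-by-category mappings.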
In the category Author, MARC has seven fields with related delimiters and FRBR has nine attributes in two entities. The elements in this category are all related to a person or corporate body responsible for a work and can be divided into four groups according to their referents: Person, Corporate Body, Meeting, and Responsibility.
The authors mapped MARC elements in each group with the corresponding FRBR attributes. In the group Person, both MARC and FRBR have elements that describe a person. Mapping shows that, with the exception of miscellaneous information about a person, these elements are exact matches. The group Corporate Body contains elements with both exact and partial matching. In both of these groups, detailed and commonly used descriptions (e.g., name, date, place, etc.) were mapped as exact matches. For the group Meeting, FRBR does not provide any specific attributes, but it does incorporate attributes related to a meeting into Corporate Body. In contrast, MARC provides the 111 (i.e., meeting name as main entry) and 711 (i.e., meeting name as added entry) fields. Thus no exact matches are found in the group Meeting.
Although the statement of responsibility in the group Responsibility demonstrates exact matches with elements in the groups Person, Corporate Body, and Meeting, this element could not be included in any group because the strength of matching is actually determined by the context of each individual record. Therefore Responsibility is retained as a separate group.
The category Title includes elements related to the title of a work or series. MARC provides three fields for the title of a work (245, 246, and 505), three fields for a uniform title (130, 240, and 730), and three fields for a series statement (440, 490, and 740). FRBR has four attributes (title of work, title of expression, title of the manifestation, and series statement) in the entities Work, Expression, and Manifestation. The elements in Title can be divided into four groups: Title Statement, Title Proper, Uniform Title, and Series Statement.
For the title of a work, FRBR provides only one attribute per entity (title of work, title of expression, or title of the manifestation), but MARC provides detailed elements related to a title (Title statement and Title proper). Among these elements, “245 $a Title statement” is an exact match with all FRBR title attributes. Other elements show analogous and partial matching relationships. For a series statement, the FRBR entity Manifestation has only the attribute series statement; the two MARC fields 440 and 490 are both related to series statements and demonstrate exact matches with FRBR's series statement. The remaining 740 MARC field is a subset of a series statement and therefore demonstrates only a partial match with FRBR's series statement.
The range of the category Subject is relatively broad because many concepts can represent the subject of a resource, including topic, classification number, geographic name, chronological term, person, etc. To deal with the different types of subject, MARC provides the 6XX fields, which are divided into several subject areas. FRBR does not provide any attribute specific to subject; but the entities in group 3 (Concept, Object, Event, and Place) are capable, in part, of representing subject information.
The elements in the category Subject can be categorized into two groups: classification number and keyword. A classification number is treated as a representation of the subject of a work, and MARC provides the three fields 050, 080, and 082. FRBR does not provide a similar attribute for the classification of a work, but it does offer the attribute manifestation identifier in the entity Manifestation. Although a classification number can function as an identifier for a work as well as an indicator of its subject, an identifier is not necessarily a subject representation. Therefore the strength of the relationship between the MARC element and the FRBR attribute can only be represented as an analogous match.
The group Keyword is divided into three subgroups: Creator, Form, and Topic. For the groups Keyword:Topic and Keyword:Form, FRBR has attributes that are both analogous and partial matches with MARC elements. For example, a uniform title in MARC's “630 $a uniform title” is a partial match with FRBR's attribute “form of work” in the entity “Work.” A uniform title in the MARC 630 field is used as a subject added entry and is therefore grouped in the category “Subject.” In addition, the 630 field can be a representation of genre or form of work because some types of uniform titles (e.g., the names of newspapers, journals, motion pictures, and radio and television programs) are placed in the 630 field. Therefore a uniform title in the MARC 630 field is not mapped with the FRBR attribute “title of a work,” but as a partial match with the attribute “form of work.” For Keyword:Creator, a person, a corporate body, or a meeting can be a subject of a work; MARC provides the fields 600, 610, and 611 to describe these subject types, but FRBR has no corresponding attributes.
The category Identifier includes elements that represent a unique marker for a work. MARC offers the fields 020 and 022; FRBR has the attribute “manifestation identifier” in the entity “Manifestation,” which can cover any type of identifier (e.g., numeric code, textual code, ISBN, or inventory number). Although the range of the FRBR attribute is broader than the MARC 020 and 022 fields, the referent of the MARC fields is similar to the FRBR attribute. Therefore the MARC fields can be considered exact matches with FRBR's “manifestation identifier.”
Elements related to the description of a resource, such as edition, physical medium, and notes, comprise the category Description. MARC provides detailed elements, including the fields 250 Edition statement, the 300 Physical description, and 340 Physical medium. FRBR also has descriptive attributes, such as edition/issue designation and physical medium. The elements in the Description category can be categorized into three groups: Edition, Representation, and Summary.
For the group Edition, both MARC and FRBR have elements that represent the edition of a resource. MARC uses the 250 field, and FRBR has the attribute edition/issue designation in the entity Manifestation. FRBR also provides the attribute other distinguishing characteristic in the entity Expression, which can describe revisions and versions as well as specific editions of a work. These are analogous matches with the MARC 250 field because the terms “revision” and “version” have meanings that are similar to, but not identical with, the term “edition.”
The group Representation contains elements that describe specific features of a resource and demonstrate exact or partial matches between MARC and FRBR. For the group Summary, both MARC and FRBR contain common elements that show exact or partial matches across the two standards.
The category Publication includes elements related to publication or distribution and publisher or distributor. FRBR offers the attributes “place of publication/distribution,” “publisher/distributor,” and “date of publication/distribution” in the entity “Manifestation” and the attribute “date of expression” in the entity “Expression.” Although MARC has only the single field, 260, for information about publication and distribution, detailed aspects of publication can be specified through the use of three delimiters: $a place of publication/distribution, $b name of publisher/distributor, and $c date of publication/distribution. Thus groups in the category Publication are based on the MARC delimiters for the 260 field (Date, Place, and Publisher), which demonstrate exact matches with corresponding FRBR attributes.
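Because the three 260 subfields are exact matches with FRBR attributes, the mapping in this category can be applied in both directions. The following is a minimal sketch under that reading; the function names and the sample values are invented for illustration.

```python
# Exact matches between MARC 260 subfield codes and FRBR Manifestation attributes.
PUBLICATION_MAP = {
    "a": "place of publication/distribution",
    "b": "publisher/distributor",
    "c": "date of publication/distribution",
}
FRBR_TO_MARC = {attribute: code for code, attribute in PUBLICATION_MAP.items()}

def field_260_to_frbr(subfields):
    """Translate (code, value) subfields of a 260 field into FRBR attributes."""
    return {PUBLICATION_MAP[code]: value
            for code, value in subfields if code in PUBLICATION_MAP}

def frbr_to_field_260(attributes):
    """Translate FRBR publication attributes back into 260 subfields."""
    return [(FRBR_TO_MARC[name], value)
            for name, value in attributes.items() if name in FRBR_TO_MARC]

# Invented sample field.
field_260 = [("a", "Chicago :"), ("b", "Example Press,"), ("c", "2009.")]
as_frbr = field_260_to_frbr(field_260)
round_trip = frbr_to_field_260(as_frbr)
```

A lossless round trip of this kind is possible only where the match is exact; analogous and partial matches would need additional handling.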
The category Format differs from the other six categories because it contains nine unique groups: Serial, Musical Work, Cartographic Work, Computer File, Image, Microform/Visual Projection, Electronic Resource, Sound Recording, and Other Formats. Each group represents a different format in which a resource can occur. Because each format has different characteristics, the elements that describe a format are often very different from those describing other formats. For example, the MARC 342 field for Geospatial reference data is not applied in any format except Cartographic Work, which leads to difficulties when attempting to extend or integrate those elements to represent new types of resources.
Among the nine Format groups, Image, Microform/Visual Projection, Electronic Resource, and Sound Recording demonstrate nonmatching relationships because MARC does not include elements for describing the unique characteristics of these formats. Although MARC does provide the 5XX fields to represent features of resources in these formats, the content of these fields is limited to general notes and a description of the medium. For the most part, however, mappings in the other groups show partial matches. This may be a result of the original purpose of each system: MARC was developed to represent printed materials, while the FRBR model was intended to address the representation of resources in multiple formats.
Following categorization and the mapping of MARC elements with FRBR attributes, the authors constructed a conceptual structure that would function as a mediator between MARC and FRBR. This structure, which consists of main classes, classes, subclasses, and instances, is made up of abstract elements and is not intended to address actual descriptive elements in MARC, FRBR, or any other bibliographic standard.
Main classes are groups of identical or similar elements and attributes from both MARC and FRBR and correspond to the seven core categories derived during the categorization process. In both the MARC and FRBR systems, the elements and attributes that belong to a main class are frequently scattered across the respective system. For example, both the 245 and 505 fields in MARC represent the title of a resource, but these fields are separated in the MARC system. Similarly, FRBR distributes the attributes related to the title of a resource across three entities (title of work, title of expression, and title of the manifestation). In the conceptual structure, each main class provides a space in which these “distributed relatives” are brought together. In addition, any duplication of elements is eliminated by the grouping of identical or similar elements and attributes.
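The gathering of “distributed relatives” into a main class can be pictured as a simple grouping step. The sketch below assumes a hypothetical list of (system, element, main class) assignments based on the title example; the names `ELEMENTS` and `group_by_main_class` are illustrative only.

```python
from collections import defaultdict

# Hypothetical (system, element, main class) assignments taken from
# the title example in the text.
ELEMENTS = [
    ("MARC", "245", "Title"),
    ("MARC", "505", "Title"),
    ("FRBR", "title of work", "Title"),
    ("FRBR", "title of expression", "Title"),
    ("FRBR", "title of the manifestation", "Title"),
    ("MARC", "245", "Title"),  # duplicate entry, dropped by the set below
]


def group_by_main_class(elements):
    """Collect elements from both systems under their main class."""
    grouped = defaultdict(set)
    for system, element, main_class in elements:
        grouped[main_class].add((system, element))
    # Sort so that the result is stable and easy to inspect.
    return {name: sorted(members) for name, members in grouped.items()}


main_classes = group_by_main_class(ELEMENTS)
```

Using a set per main class is one simple way to mirror the removal of duplicate elements mentioned above.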
Among the types of match used in the mapping process, exact matches indicate that MARC and FRBR use the same elements to describe a particular aspect of a bibliographic entity. The element(s) and attribute(s) whose relationship is based on an exact match are used to form a class of data elements that connects MARC elements directly to FRBR attributes. Only those elements and attributes that share a comprehensive or identical referent were selected to form a class.
Each class is nested under its relevant main class (table 7). Analogous matches (those matches indicating that a similar but not identical meaning or referent is shared between a MARC element and a FRBR attribute) form subclasses. Although an analogous match does not indicate a comprehensive meaning encompassing the meaning of each subclass, it can demonstrate conceptual associations across a range of elements. Partial matches are identified as instances of classes and subclasses, with each instance representing the specific details of a bibliographic record. The full conceptual structure is provided in appendix B.
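The rule relating match types to levels of the structure can be written as a small lookup. The following is a minimal sketch; the names `LEVEL_BY_MATCH` and `level_for` are hypothetical, and returning `None` for nonmatching pairs is an assumption made here for illustration rather than something the text specifies.

```python
# Level of the conceptual structure formed by each match type, as
# described in the text. Nonmatching ("non") pairs are assumed here
# to form no component, so they are left out of the table.
LEVEL_BY_MATCH = {
    "exact": "class",
    "analogous": "subclass",
    "partial": "instance",
}


def level_for(match_type):
    """Return the structural level for a match type, or None."""
    return LEVEL_BY_MATCH.get(match_type)


# A few mapping rows written as (MARC field, FRBR attribute, match type).
rows = [
    ("250 $a", "edition/issue designation", "exact"),
    ("250 $a", "other distinguishing characteristic", "analogous"),
    ("500 $a", "critical response to the expression", "partial"),
    ("256", "", "non"),
]
levels = [(marc, level_for(match)) for marc, _, match in rows]
```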
Main classes, classes, subclasses, and instances are nested hierarchically in the proposed conceptual structure, with the main class defining the semantic range of its subordinate classes, subclasses, and instances. A class brings together related elements and subclasses under each main class. Instances represent the more detailed aspects of a particular class and serve as the actual medium for connecting MARC elements and FRBR attributes within the conceptual structure.
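One possible way to hold this nested hierarchy in code is a small tree of nodes. The `Node` class below is a hypothetical representation, not the authors' implementation; the sample values follow the Title row of the proposed structure in appendix B.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One component of the structure (hypothetical representation)."""
    name: str
    level: str  # "main class", "class", "subclass", or "instance"
    marc_fields: List[str] = field(default_factory=list)
    frbr_entities: List[str] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        """Attach a child node and return it so calls can be chained."""
        self.children.append(child)
        return child

    def walk(self, depth: int = 0):
        """Yield (depth, node) pairs from the top of the tree down."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


title = Node("Title", "main class")
statement = title.add(Node("Title Statement", "class"))
sub = statement.add(Node("title", "subclass"))
sub.add(Node("title.proper", "instance", ["245"], ["Group 1"]))

outline = [(depth, node.name) for depth, node in title.walk()]
```

Only the instance carries links to a MARC field and FRBR entities in this sketch, which reflects the statement that instances serve as the medium for connecting the two systems.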
The proposed conceptual structure is not intended for the description of specific resources but as a set of bibliographic data elements that can be linked directly both to MARC elements and to FRBR attributes. If an element in the proposed conceptual structure connects MARC elements with FRBR entities and attributes, MARC can be used for detailed descriptive elements and FRBR for the representation of bibliographic relationships. In this way, an element in the conceptual structure can be used to indicate both detailed descriptive elements and bibliographic relationships.
This structure also addresses some of the weaknesses in both MARC and FRBR. To overcome the lack of bibliographic relationships resulting from MARC's flat structure, MARC elements can be supplemented with FRBR's bibliographic relationships through connections established in the conceptual structure. In similar fashion, the lack of sufficient descriptive attributes in FRBR can be addressed by incorporation of related MARC elements.
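The mediating role of the conceptual structure can be illustrated with a lookup that works in both directions. This is a sketch under stated assumptions: the dictionary `MEDIATOR` and the functions `frbr_for_marc` and `marc_for_frbr` are hypothetical names, and the three entries are a small excerpt of the values listed in appendix B.

```python
# Hypothetical mediator entries: conceptual element -> linked MARC
# fields and FRBR entities (values follow appendix B).
MEDIATOR = {
    "person.name": {"marc": ["100", "700"], "frbr": ["Person"]},
    "publication.place": {"marc": ["260"], "frbr": ["Manifestation"]},
    "summarization": {"marc": ["520"], "frbr": ["Expression"]},
}


def frbr_for_marc(tag):
    """FRBR entities reachable from a MARC tag through the mediator."""
    found = set()
    for links in MEDIATOR.values():
        if tag in links["marc"]:
            found.update(links["frbr"])
    return sorted(found)


def marc_for_frbr(entity):
    """MARC tags that can supply description for a FRBR entity."""
    found = set()
    for links in MEDIATOR.values():
        if entity in links["frbr"]:
            found.update(links["marc"])
    return sorted(found)
```

Calling `frbr_for_marc("700")` finds the FRBR entity whose relationships can supplement a MARC field, and `marc_for_frbr("Manifestation")` finds the MARC fields that can add descriptive detail to a FRBR entity.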
The authors constructed a conceptual structure that can function as a mediator between the heterogeneous bibliographic systems MARC and FRBR. MARC is limited in describing information resources because of its rigid and single-layered linear structure. Although the MARC format is suitable for representing more traditional resources, such as books and print materials, weaknesses inherent in its structure prevent full representation of the complex nature and semantics of digital resources. In contrast, the FRBR model focuses on the organization of entities and attributes supporting multiple relationships between bibliographic entities. While FRBR can support representation of the multilayered characteristics of resources, it does not provide sufficient descriptive elements to fully represent bibliographic entities. Additionally, because the relationships between attributes in FRBR are predetermined, they might restrict the flexibility necessary to describe the dynamic nature of digital resources.
The conceptual structure proposed here serves as a mediator between MARC elements and FRBR attributes. In this conceptual structure, data elements are able to connect MARC elements and FRBR entities and attributes because the components of the conceptual structure were extracted directly from the core elements of both systems. Although the conceptual structure is not intended for the actual description of resources, it does provide a set of core bibliographic elements that, in association with an explicit structure of relationships, can make up for the weaknesses of both MARC and FRBR. More importantly, the elements of the MARC format can be used in association with bibliographic relationships in the FRBR model through the connections provided in the conceptual structure. In addition, systems that use MARC bibliographic records can be expected to demonstrate enhanced capability for information retrieval when combining, through the conceptual structure, the detailed description of MARC with the explicit relationships of FRBR.
References
1. Monika Halina Szunejko, “The Description of Internet Resources: A Consideration of the Relationship between MARC and Other Metadata Schemes,” Technical Services Quarterly 3, no. 18 (2002): 1–10.
2. IFLA Study Group on the Functional Requirements for Bibliographic Records, Functional Requirements for Bibliographic Records: Final Report, www.ifla.org/files/cataloguing/frbr/frbr_2008.pdf (accessed June 7, 2010); Jeffrey Beall, “Some Reservations about FRBR,” Library Hi Tech News 23, no. 2 (2006): 15–16.
3. Tom Delsey, “The Logical Structure of the Anglo-American Cataloguing Rules—Part I,” drafted for the Joint Steering Committee for Revision of AACR, www.rda-jsc.org/docs/aacr.pdf (accessed June 7, 2010).
4. Trond Aalberg, “From MARC to FRBR: A Case Study in the Use of the FRBR Model on the BIBSYS Database” (slideshow presentation, Satellite Meeting to the 71st World Library and Information Congress, “Bibliotheca Universalis—How to Organize Chaos!” Järvenpää, Finland, Aug. 11–12, 2005), www.fla.fi/frbr05/aalberg2BIBSYSfrbrized.pdf (accessed July 18, 2010).
5. Martin Kurth, David Ruddy, and Nathan Rupp, “Repurposing MARC Metadata: Using Digital Project Experience to Develop a Metadata Management Design,” Library Hi Tech 22, no. 2 (2004): 153–65.
6. Pat Riva, “Mapping MARC21 Linking Entry Fields to FRBR and Tillett's Taxonomy of Bibliographic Relationships,” Library Resources & Technical Services 48, no. 2 (2004): 130–43.
7. Christian Mönch and Trond Aalberg, “Automatic Conversion from MARC to FRBR,” in Research and Advanced Technology for Digital Libraries: 7th European Conference, ECDL 2003, Trondheim, Norway, August 17–22, 2003: Proceedings, ed. Traugott Koch and Ingeborg T. Sølvberg (Berlin: Springer, 2003): 405–11.
8. Library of Congress, Network Development and MARC Standards Office, MARC21 Concise Format for Bibliographic Data: 1999 Edition, Update No. 1 (October 2001) through Update No. 11 (February 2010), www.loc.gov/marc/bibliographic/ecbdhome.html (accessed June 7, 2010).
9. IFLA Study Group, Functional Requirements for Bibliographic Records.
Tables
Components of Directory
Components of Directory | Description |
Field | Each bibliographic record is divided into fields. |
Tag | The name of each field is represented by a three-digit field tag that identifies the kind of data present in the field. |
Indicator | Two one-character positions follow each tag and provide further information for machine processing of the bibliographic data. |
Subfield | A field may include one or more data values, each of which is contained in a subfield. |
Subfield codes and delimiters | Each subfield is preceded by a delimiter and a subfield code. Delimiters and codes are used to identify separate elements of information within the field. |
Entities Comprising FRBR Groups
Group | Entities | Attributes |
Group 1 | Work | work title, form or genre, date, performance medium, intended audience, etc. |
Group 1 | Expression | expression title, form of the expression, language of the expression, type of score, scale of a map, etc. |
Group 1 | Manifestation | manifestation title, publisher, date of publication, form of carrier, dimensions, manifestation identifier (e.g., ISBN), terms of availability, etc. |
Group 1 | Item | location or call number, barcode, provenance, condition, access restrictions on an item, etc. |
Group 2 | Person | names, dates, titles, other designations, etc. |
Group 2 | Corporate Body | name, number, place, date, other designations, etc. |
Group 3 | Concept | term |
Group 3 | Object | term |
Group 3 | Event | term |
Group 3 | Place | term |
Categorization of MARC Elements
Category | MARC field | Delimiter | Description |
Author | 100 | $a | Personal name |
Author | 110 | $a | Corporate name |
Author | 111 | $a | Meeting name |
Author | 245 | $c | Statement of responsibility |
Author | 700 | $a | Personal name |
Author | 710 | $a | Corporate name |
Author | 711 | $a | Meeting name |
Title | 130 | $a | Uniform title |
Title | 240 | $a | Uniform title |
Title | 245 | $a | Title statement |
Title | 245 | $b | Remainder of title |
Title | 440 | $a | Series statement |
Title | 490 | $a | Series statement |
Title | 505 | $t | Title |
Title | 730 | $a | Uniform title |
Title | 740 | $a | Uncontrolled title |
Subject | 050 | $a | LCC classification number |
Subject | 080 | $a | UDC classification number |
Subject | 082 | $a | DDC classification number |
Subject | 600 | all | Personal name |
Subject | 610 | all | Corporate name |
Subject | 611 | $a | Meeting name |
Subject | 630 | $a | Uniform title |
Subject | 648 | $a | Chronological term |
Subject | 650 | $a | Topical term |
Subject | 651 | $a | Geographic name |
Subject | 651 | $x | General subdivision |
Subject | 653 | $a | Uncontrolled index term |
Subject | 654 | $a | Focus term |
Publication | 260 | $a | Place of publication |
Publication | 260 | $b | Name of publisher |
Publication | 260 | $c | Date of publication |
Identifier | 020 | $a | ISBN |
Identifier | 022 | $a | ISSN |
Description | 250 | $a | Edition statement |
Description | 300 | $a | Extent |
Description | 300 | $b | Physical details |
Description | 300 | $c | Dimensions |
Description | 500 | $a | General note |
Description | 505 | $a | Formatted content notes |
Description | 520 | $a | Summary |
Format | Serial, musical work, cartographic work, computer file, image, microform, electronic resource, sound recording, etc. |
Format Elements in MARC Categorization
Subcategory | MARC field | Delimiter | Description |
Serial | 310 | $a | Current publication frequency |
Serial | 321 | $a | Former publication frequency |
Serial | 362 | $a | Dates of publication |
Image | 352 | $a | Digital graphic representation |
Cartographic work | 255 | $a | Cartographic mathematical data |
Cartographic work | 342 | $a | Geospatial reference data |
Cartographic work | 343 | $a | Planar coordinate data |
Computer file | 256 | $a | Computer file characteristics |
Musical work | 254 | $a | Musical presentation statement |
Electronic resource | 856 | all | Electronic location and access |
Categorization of FRBR Entities and Attributes
Category | Entity | Attribute |
Author | Person | name of person |
Author | Person | dates of person |
Author | Person | title of person |
Author | Person | other designation |
Author | Corporate Body | name of the corporate body |
Author | Corporate Body | number associated with the corporate body |
Author | Corporate Body | place associated with the corporate body |
Author | Corporate Body | date associated with the corporate body |
Author | Corporate Body | other designation |
Author | Manifestation | statement of responsibility |
Title | Work | title of work |
Title | Expression | title of expression |
Title | Manifestation | title of manifestation |
Title | Manifestation | series statement |
Identifier | Manifestation | manifestation identifier |
Publisher/Publication | Expression | date of expression |
Publisher/Publication | Manifestation | place of publication/distribution |
Publisher/Publication | Manifestation | date of publication/distribution |
Publisher/Publication | Manifestation | publisher/distributor |
Subject | Work | form of work |
Subject | Work | context for the work |
Subject | Expression | context for the expression |
Subject | Item | item identifier |
Subject | Concept | term for the concept |
Subject | Object | term for the object |
Description | Expression | form of the expression |
Description | Expression | critical response to the expression |
Description | Expression | summarization of content |
Description | Expression | other distinguishing characteristics |
Description | Manifestation | issue designation |
Description | Manifestation | extent of carrier |
Format | serials, musical work, image, cartographic work, electronic resource, sound recording |
Format Attributes in FRBR Categorization
Subcategory | Entity | Attribute |
Serial | Expression | expected regularity of issue |
Serial | Expression | expected frequency of issue |
Serial | Expression | sequencing pattern |
Serial | Manifestation | numbering |
Serial | Manifestation | publication status |
Musical work | Work | medium of performance |
Musical work | Work | numeric designation |
Musical work | Work | key |
Musical work | Expression | type of score |
Musical work | Expression | medium of performance |
Image | Expression | recording technique |
Image | Expression | special characteristic |
Image | Expression | technique |
Image | Manifestation | colour |
Cartographic work | Work | coordinates |
Cartographic work | Work | equinox |
Cartographic work | Expression | scale |
Cartographic work | Expression | projection |
Cartographic work | Expression | presentation technique |
Cartographic work | Expression | representation of relief |
Cartographic work | Expression | geodetic, grid, vertical measurement |
Electronic resource | Manifestation | system requirements |
Electronic resource | Manifestation | file characteristics |
Electronic resource | Manifestation | mode of access |
Electronic resource | Manifestation | access address |
Sound recording | Manifestation | playing speed |
Sound recording | Manifestation | groove width |
Sound recording | Manifestation | kind of cutting |
Sound recording | Manifestation | tape configuration |
Sound recording | Manifestation | kind of sound |
Sound recording | Manifestation | special reproduction characteristic |
Main Classes and Classes of Proposed Conceptual Structure
Main class | Classes |
Author | Person, Corporate Body, Meeting |
Title | Title Statement, Series Statement |
Subject | Classification Number, Keyword |
Description | Edition, Summary, Representation |
Identifier | Identifier |
Publication | Publisher |
Format | Serials, Musical Work, Cartographic Work, Computer File, Image, Microform, Electronic Resource, Sound Recording, Other Formats |
Class | Group | MARC Field | Delimiter | Description | FRBR Entity | Attribute | Strength of Relationship |
Author | Person | 100 | $a | personal name | Person | name of person | exact |
Author | Person | 700 | $a | personal name | | | |
Author | Person | 100 | $c | title and other words | Person | title of person | exact |
Author | Person | 700 | $c | title and other words | | | |
Author | Person | 100 | $d | dates associated with the person | Person | dates of person | exact |
Author | Person | 700 | $d | dates associated with the person | | | |
Author | Person | 100 | $g | miscellaneous information | Person | other designation associated with the person | partial |
Author | Person | 700 | $g | miscellaneous information | | | |
Author | Corporate Body | 110 | $a | corporate name | Corporate Body | name of the corporate body | exact |
Author | Corporate Body | 710 | $a | corporate name | | | |
Author | Corporate Body | 110 | $g | miscellaneous information | Corporate Body | number associated with the corporate body | partial |
Author | Corporate Body | 710 | $g | miscellaneous information | | | |
Author | Corporate Body | 110 | $g | miscellaneous information | Corporate Body | other designation associated with the corporate body | partial |
Author | Corporate Body | 710 | $g | miscellaneous information | | | |
Author | Corporate Body | 110 | $c | location | Corporate Body | place associated with the corporate body | exact |
Author | Corporate Body | 710 | $c | location of meeting | | | |
Author | Corporate Body | 110 | $f | date | Corporate Body | date associated with the corporate body | exact |
Author | Corporate Body | 710 | $f | date | | | |
Author | Meeting | 111 | $a | meeting name | Corporate Body | name of the corporate body | analogous |
Author | Meeting | 711 | $a | meeting name | | | |
Author | Meeting | 111 | $g | miscellaneous information | Corporate Body | number associated with the corporate body | partial |
Author | Meeting | 711 | $g | miscellaneous information | | | |
Author | Meeting | 111 | $c | place | Corporate Body | place associated with the corporate body | analogous |
Author | Meeting | 711 | $c | location of meeting | | | |
Author | Meeting | 111 | $f | date of meeting | Corporate Body | date associated with the corporate body | analogous |
Author | Meeting | 711 | $f | date of meeting | | | |
Author | Meeting | 111 | $g | miscellaneous information | Corporate Body | other designation associated with the corporate body | partial |
Author | Meeting | 711 | $g | miscellaneous information | | | |
Author | Responsibility | 245 | $c | statement of responsibility | Manifestation | statement of responsibility | exact |
Title | Title Statement | 245 | $a | title statement | Work | title of work | exact |
Title | Title Statement | | | | Expression | title of expression | exact |
Title | Title Statement | | | | Manifestation | title of the manifestation | exact |
Title | Title Statement | 505 | $t | title | Work | title of work | analogous |
Title | Title Statement | | | | Expression | title of expression | analogous |
Title | Title Statement | | | | Manifestation | title of the manifestation | analogous |
Title | Title Proper | 246 | $a | title proper | Work | title of work | partial |
Title | Title Proper | 246 | $b | remainder of title | | | |
Title | Title Proper | | | | Expression | title of expression | partial |
Title | Title Proper | | | | Manifestation | title of the manifestation | partial |
Title | Uniform Title | 130 | $a | uniform title | Work | title of work | analogous |
Title | Uniform Title | 240 | $a | uniform title | | | |
Title | Uniform Title | 730 | $a | uniform title | | | |
Title | Series Statement | 440 | $a | series statement/added entry-title | Manifestation | series statement | exact |
Title | Series Statement | 490 | $a | series statement | | | exact |
Title | Series Statement | 740 | $a | uncontrolled title | Manifestation | series statement | partial |
Subject | Classification Number | 050 | $a | LCC classification number | Manifestation | manifestation identifier | analogous |
Subject | Classification Number | 080 | $a | UDC classification number | Manifestation | manifestation identifier | analogous |
Subject | Classification Number | 082 | $a | DDC classification number | Manifestation | manifestation identifier | analogous |
Subject | Keyword Creator | 600 | all | personal name | | | non |
Subject | Keyword Creator | 610 | all | corporate name | | | |
Subject | Keyword Creator | 611 | $a | meeting name | | | |
Subject | Keyword: Form | 630 | $a | uniform title | Work | form of work | partial |
Subject | Keyword: Form | 655 | $a | genre/form | | | analogous |
Subject | Keyword: Topic | 648 | $a | chronological term | Work | context for the work | partial |
Subject | Keyword: Topic | | | | Expression | context for the expression | partial |
Subject | Keyword: Topic | 650 | $a | topical term | Concept | term for the concept | analogous |
Subject | Keyword: Topic | 651 | $a | geographic name | | | partial |
Subject | Keyword: Topic | 651 | $x | general subdivision | Work | context for the work | partial |
Subject | Keyword: Topic | | | | Expression | context for the expression | partial |
Subject | Keyword: Topic | 651 | $y | chronological subdivision | Work | context for the work | partial |
Subject | Keyword: Topic | 653 | $a | uncontrolled | Expression | context for the expression | partial |
Subject | Keyword: Topic | | | | Object | term for the object | partial |
Subject | Keyword: Topic | 654 | $a | focus term | Concept | term for the concept | analogous |
Identifier | Identifier | 020 | $a | ISBN | Manifestation | manifestation identifier | exact |
Identifier | Identifier | 022 | $a | ISSN | | | |
Description | Edition | 250 | $a | edition statement | Manifestation | edition/issue designation | exact |
Description | Edition | | | | Expression | other distinguishing characteristic | analogous |
Description | Summary | 500 | $a | general note | Expression | critical response to the expression | partial |
Description | Summary | 505 | $a | formatted contents note | Expression | form of expression | partial |
Description | Summary | 520 | $a | summary | Expression | summarization of content | exact |
Description | Representation | 300 | $a | extent | Manifestation | extent of carrier | exact |
Description | Representation | 300 | $b | other physical details | Manifestation | extent of carrier | partial |
Description | Representation | 300 | $c | dimensions | Manifestation | extent of carrier | partial |
Description | Representation | 340 | $a | physical medium | Manifestation | physical medium | exact |
Description | Representation | | | | Manifestation | form of carrier | analogous |
Publication | Place | 260 | $a | place of publication/distribution | Manifestation | place of publication/distribution | exact |
Publication | Publisher | 260 | $b | name of publisher/distributor | Manifestation | publisher/distributor | exact |
Publication | Date | 260 | $c | date of publication/distribution | Expression | date of expression | exact |
Publication | Date | | | | Manifestation | date of publication/distribution | exact |
Format | Serial | 310 | $a | current publication frequency | Expression | expected regularity of issue | analogous |
Format | Serial | | | | Expression | expected frequency of issue | analogous |
Format | Serial | | | | Expression | sequencing pattern | analogous |
Format | Serial | 321 | $a | former publication frequency | Expression | expected regularity of issue | partial |
Format | Serial | 362 | $a | dates of publication and/or sequential designation | Manifestation | numbering | partial |
Format | Serial | | | | Manifestation | publication status | partial |
Format | Musical Work | 254 | $a | musical presentation statement | Work | medium of performance | partial |
Format | Musical Work | | | | Work | numeric designation | partial |
Format | Musical Work | | | | Work | key | partial |
Format | Musical Work | | | | Expression | type of score | partial |
Format | Musical Work | | | | Expression | medium of performance | partial |
Format | Cartographic Work | 255 | all | cartographic mathematical data | Work | coordinates | analogous |
Format | Cartographic Work | 342 | all | geospatial reference data | | | |
Format | Cartographic Work | 343 | all | planar coordinate data | Work | equinox | partial |
Format | Cartographic Work | | | | Expression | scale | partial |
Format | Cartographic Work | | | | Expression | projection | partial |
Format | Cartographic Work | | | | Expression | presentation technique | partial |
Format | Cartographic Work | | | | Expression | representation of relief | partial |
Format | Cartographic Work | | | | Expression | geodetic, grid, and vertical measurement | partial |
Format | Computer File | 256 | all | computer file characteristics | | | non |
Format | Computer File | 352 | all | digital graphic representation | Expression | recording technique | partial |
Format | Computer File | | | | Expression | special characteristic | partial |
Format | Computer File | | | | Expression | technique | partial |
Format | Image | | | | Manifestation | colour | non |
Format | Microform/Visual Projection | | | | Manifestation | reduction ratio | non |
Format | Microform/Visual Projection | | | | Manifestation | polarity | non |
Format | Microform/Visual Projection | | | | Manifestation | generation | non |
Format | Microform/Visual Projection | | | | Manifestation | presentation format | non |
Format | Electronic Resource | | | | Manifestation | system requirements | non |
Format | Electronic Resource | | | | Manifestation | file characteristics | non |
Format | Electronic Resource | | | | Manifestation | mode of access | non |
Format | Electronic Resource | | | | Manifestation | access address | non |
Format | Sound Recording | | | | Manifestation | playing speed | non |
Format | Sound Recording | | | | Manifestation | groove width | non |
Format | Sound Recording | | | | Manifestation | kind of cutting | non |
Format | Sound Recording | | | | Manifestation | tape configuration | non |
Format | Sound Recording | | | | Manifestation | kind of sound | non |
Format | Sound Recording | | | | Manifestation | special reproduction characteristic | non |
Format | Other Formats | 300 | | physical description | Expression | revisability of expression | partial |
Format | Other Formats | 306 | | playing time | | | |
Format | Other Formats | 307 | | hours | | | |
*Note. In these tables, exact indicates exact matching between MARC data elements and FRBR entities; analogous indicates analogous matching; partial means partial matching; and non indicates nonmatching.
Proposed Conceptual Structure | Relationship with MARC and FRBR | ||||
Main Class | Class | Subclass | Instance | MARC Field | FRBR Entity |
Author | <Person> | person.name | 100,700 | Person | |
person.title | 100,700 | Person | |||
person.date | 100,700 | Person | |||
person.other | 100,700 | Person | |||
<Corporate Body> | corporate.name | 110,710 | Corporate Body | ||
corporate.location | 110,710 | Corporate Body | |||
corporate.date | 110,710 | Corporate Body | |||
<Meeting> | meeting.name | 111,711 | Corporate Body | ||
meeting.place | 111,711 | Corporate Body | |||
meeting.date | 111,711 | Corporate Body | |||
<Responsibility> | 245 | Manifestation | |||
Title | <Title Statement> | title | title.proper | 245 | Group 1 |
title.remainder | 246 | Group 1 | |||
subtitle | 245,505 | ||||
<Uniform title> | 130 | Work | |||
730 | Work | ||||
240 | Work | ||||
<Series Statement> | series.statement | 440,490 | Manifestation | ||
series.subtitle | 740 | Manifestation | |||
Subject | <Classification No.> | LCC | 050 | Manifestation | |
UDC | 080 | Manifestation | |||
DDC | 082 | Manifestation | |||
<Keyword> | keyword.creator | person | 600 | ||
corporate.body | 610 | ||||
meeting | 611 | ||||
keyword.form | uniform.title | 630 | Work | ||
Genre | 655 | Work | |||
keyword.topic | Chronology | 648 | Work, Expression | ||
651 | Work | ||||
Topic | 650 | Concept | |||
focus.term | 654 | Concept | |||
653 | Work | ||||
Geography | 651 | Concept | |||
Uncontrolled | 653 | Expression, Object | |||
Publication | <Place> | publication.place | 260 | Manifestation | |
<Name> | publisher.name | 260 | Manifestation | ||
<Date> | publication.date | 260 | Manifestation | ||
Identifier | <Identifier> | ISBN | 020 | Manifestation | |
ISSN | 022 | Manifestation | |||
Description | <Edition> | edition.statement | 250 | Expression | |
250 | Manifestation | ||||
<Summary> | note | note | 500 | Expression | |
content.note | 505 | Expression | |||
summarization | 520 | Expression | |||
<Representation> | physical | physical.medium | 300 | Manifestation | |
physical.detail | 340 | Manifestation | |||
extent | 300 | Manifestation | |||
dimension | 300 | Manifestation | |||
Format | <Serials> | serial.publication | serial.frequency | 310,321 | Expression |
serial.date | 362 | Manifestation | |||
serial.sequence | 362 | Manifestation | |||
<Musical Work> | music.statement | 254 | Work, Expression | ||
<Cartographic Work> | map.data | mathematic.data | 255 | Work | |
reference.data | 342 | Work | |||
map.planar | coordinate.data | 343 | Work, Expression | ||
<Computer File> | computer.file | 256 | Expression | ||
computer.graphic | 352 | Expression | |||
<Image> | Manifestation | ||||
<Electronic Resource> | 856 | Manifestation | |||
<Sound Recording> | Manifestation | ||||
<Other Format> | format.description | 300 | Expression | ||
format.playing | playing.time | 306,307 | Expression |
*Note. The proposed conceptual structure contains main classes, subclasses, and some instances only. The instances of each class are not included in this structure to clearly show the core elements of the conceptual structure. Some major instances are included.