Metadata for Digital Collections


Noah Geraci / July 19, 2017

UCR Library
Metadata and Technical Services

Getting started

Green post-it: a question you have

Blue post-it: something you'd like to learn

Workshop guidelines

  • One person talks at a time
  • No feigning surprise
  • No "well-actually"s

(inspired by the Recurse Center social rules)

What we'll cover

  • General overview: myths and facts, vocabulary, digital collections at UCR
  • Example: following a digitized archival photo
  • Tools and OpenRefine demo

Myth

You have to be a technology whiz to work with digital collections metadata.

Fact

Enjoying problem-solving, learning, and experimenting matters more than any specific tech skill.

Myth

Being a cataloger and being a metadata specialist are two totally different jobs and skillsets.

Fact

If you’ve cataloged in MARC, you’re already familiar with one major metadata standard, and with concepts like authority control.

What is metadata?

collection of images showing different kinds of metadata, including a card catalog, an EAD finding aid, iTunes metadata, EXIF metadata from a digital camera, a MARC record, and a headline about NSA metadata collection

…so what are we talking about today?

Information we create according to standards and best practices to “arrange, describe, track, and otherwise enhance access to information objects”

Anne Gilliland, “Setting the Stage,” Introduction to Metadata, ed. Murtha Baca, p. 2

Digital collections at UCR

Digitized archival collections · Extended Dublin Core in Nuxeo, published to Calisphere
Web archives · Dublin Core in Archive-It
Born-digital monographs (non-platform) · MARC record mapped to minimal Dublin Core in Nuxeo

Vocabulary

schema: a plan showing the relationships between metadata elements, including semantics, syntax, and optionality. Also called an element set or scheme.

crosswalk: a table that shows equivalent elements or fields in multiple schemas, used to transform metadata from one schema to another, e.g. MARC to MODS, EAD to Dublin Core, etc.
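A crosswalk can be pictured as a simple lookup table. The sketch below is illustrative only: the three-row MARC-to-Dublin-Core mapping is a simplified assumption rather than an official crosswalk, and `MARC_TO_DC`, `apply_crosswalk`, and the sample record are hypothetical names and data made up for this example.

```python
# Simplified, illustrative crosswalk: MARC field tag -> Dublin Core element.
# A real crosswalk works at the subfield level and covers many more fields.
MARC_TO_DC = {
    "245": "title",
    "520": "description",
    "650": "subject",
}


def apply_crosswalk(record, crosswalk):
    """Transform {source_field: [values]} into {target_element: [values]}."""
    result = {}
    for field, values in record.items():
        target = crosswalk.get(field)
        if target is None:
            continue  # no equivalent element in the target schema
        result.setdefault(target, []).extend(values)
    return result


# Made-up record: field tags mapped to lists of values.
marc_record = {
    "245": ["Example title"],
    "650": ["Example subject A", "Example subject B"],
    "999": ["Local note with no mapping"],
}

dc_record = apply_crosswalk(marc_record, MARC_TO_DC)
print(dc_record)
```

Fields with no row in the table (here, the local 999 field) are dropped, which is one reason crosswalking between schemas can lose information.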

tabular data: data represented in a table-like structure with rows and columns, such as an Excel file, CSV, TSV
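As a small illustration of tabular data, the sketch below reads a CSV with Python's built-in csv module. The column names and the two rows are invented for this example; a real export from a spreadsheet or repository would have its own headers.

```python
import csv
import io

# Made-up CSV: the first line holds column headers, each later line is one row.
SAMPLE_CSV = """identifier,title,date
item_001,Example photograph,1980
item_002,Example letter,1981
"""

# DictReader turns each row into a dictionary keyed by the column headers.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

for row in rows:
    print(row["identifier"], "|", row["title"], "|", row["date"])
```

Every value comes back as text, so dates and numbers need to be converted explicitly if you want to sort or calculate with them.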

What makes good digital collections metadata?

  1. Conforms to community standards
  2. Supports interoperability
  3. Uses authority control for description and collocation
  4. Makes conditions of use of digital objects clear
  5. Supports long-term preservation
  6. Metadata records are objects themselves, and should have the qualities of good objects:
    • authority, authenticity, archivability, persistence, unique identification

National Information Standards Organization, A Framework of Guidance for Building Good Digital Collections, 3rd edition. 2007.

Metadata example:

Photo from the Tomás Rivera archive

black and white candid photo of Tomás Rivera eating lunch with a group of students

Finding aid:

Folder-level description

screenshot of finding aid, reads 'Folder 11, UCR Student Activities 1979-84'
same image of Tomás Rivera eating lunch with students, shown in context on Calisphere with title 'Tomas Rivera and students at MEChA day'
screenshot of Calisphere metadata, can be viewed in text format at link https://calisphere.org/item/c8bdcdff-f180-477d-a85d-75f31534276d/

Nuxeo

Nuxeo spreadsheet

Anatomy of a file name:

curivsc_253_005_001_011_001

curivsc: repository code

253: collection number 253

005: series 5

001: box 1

011: folder 11

001: item 1
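Because the file name follows a fixed pattern, it can be split into its parts automatically. This is a minimal sketch, assuming every file name has exactly six underscore-separated segments in the order shown above; the function name `parse_file_name` and the field labels are hypothetical, not part of any UCR tool.

```python
# Segment labels, in the order they appear in the file name.
FIELDS = ["repository", "collection", "series", "box", "folder", "item"]


def parse_file_name(name):
    """Split a file name like 'curivsc_253_005_001_011_001' into labeled parts."""
    parts = name.split("_")
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} segments, got {len(parts)}: {name!r}")
    # The repository code stays as text; the rest are zero-padded numbers.
    parsed = {"repository": parts[0]}
    for field, part in zip(FIELDS[1:], parts[1:]):
        parsed[field] = int(part)
    return parsed


parsed = parse_file_name("curivsc_253_005_001_011_001")
print(parsed)
```

Converting the numeric segments with int() strips the zero padding, so folder "011" becomes 11, matching the folder number in the finding aid.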

Metadata tools

“The computer is incredibly fast, accurate, and stupid. Man is incredibly slow, inaccurate, and brilliant.”

Leo Cherne, 1977

A few tools

  • Excel/Google Sheets
  • OpenRefine
  • Oxygen (XML editor)
  • pandas (Python library)
  • Plain-text editor
    (Notepad, Notepad++, Sublime Text, many more...)
  • MarcEdit

OpenRefine

Wrapping up

Green post-it: a question you have

Blue post-it: something you'd like to learn more about

General Resources

UC Resources