Controlled Vocabulary Usage
Prepared by: DLP Metadata Implementation Working Group
Date: March 2018
Status: Final Draft
Overview
This document supplements the metadata specifications produced by the DLP Metadata Implementation Working Group (M-IWG). M-IWG documentation stored as spreadsheets includes a Value Constraints column, which indicates whether a field uses controlled term entry. Sources and/or values for controlled terms are documented in greater detail here. The authorities and local terms documented here are starter recommendations and may need to be expanded over time. Note: some sections of this documentation will remain incomplete until implementation occurs.
Original Google Document (restricted)
Descriptive Metadata
Final metadata specification worksheet
Institution (D1) - Controlled Terms - External
Recommended values are from the Library of Congress Name Authority File (LCNAF). The usual value would be “Emory University”; in the future, additional participating institutions could be added.
Data type: URI
Sample value: Emory University
Range of values: http://id.loc.gov/authorities/names.html
Current systems usage: DAMS, DV
Holding Repository (D2) - Controlled Terms - External
Recommended values are from Library of Congress Name Authority File (LCNAF)
Data type: URI
Sample value: James S. Guy Chemistry Library
Range of values: LCNAF
Recommended Emory names (from Core Metadata website)
Current systems usage: DAMS, KEEP, DB?
Administrative Unit (D3) - Controlled Terms - Local
Local terms for different administrative units within the same library
Data type: String?
Sample value: Emory University Archives
Range of values:
Emory University Archives
Stuart A. Rose Manuscript, Archives, and Rare Book Library
Current systems usage: DAMS, KEEP
Content Type (D5) - Controlled Terms - External
Values are from LC Resource Types. For ETD, there is a local list (supplemental files only). There is a large list of values on GitHub (link), but only 6 are available in the current release.
Data type: URI, String
Sample value: Still Image
Range of values: LC Resource Types, ETD Local List (Video; Image; Text; Dataset; Sound; Software)
Current systems usage: DAMS, DB, ETD, OE, Keep
Content Genre (D6) - Controlled Terms - Local/External
Values are from a mix of controlled terms (Getty AAT, LCSH, MARC Genre) and some local terms (for ETDs?)
Data type: String or URI depending?
Sample value: Astrophotographs
Range of values: Getty AAT; LCSH; MARC genre/terms; ETD local values
Current systems usage: DAMS (Controlled - AAT), DB (Controlled - LCSH), OE (Controlled - MARC Genre Terms), Keep (Controlled - AAT), DV (free-text string)
Primary Language (D10) - Controlled Terms - External
Values are a mix of local lists (DAMS, DV) and controlled terms (ISO639-2b and the MARC List for Languages)
Data type: URI
Sample value: Latvian
Range of values: ISO639-2b; MARC List for Languages
Current systems usage: DAMS (currently uses an incomplete local list), DB (MARC List), ETD (local list - English; French; Spanish; plans to expand in future, possibly to the ISO list?), OE (MARC List), Keep (no languages), DV (local list, similar to ISO639)
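One way the ETD local list could be aligned with ISO639-2b is a simple lookup table. The following is a minimal sketch only: the table name and the function to_iso639_2b are hypothetical, and the table covers just the three ETD languages plus the sample value above.

```python
# Hypothetical lookup from local language labels to ISO 639-2 bibliographic (B) codes.
# Labels come from the ETD local list, plus the sample value "Latvian".
LOCAL_TO_ISO639_2B = {
    "English": "eng",
    "French": "fre",
    "Spanish": "spa",
    "Latvian": "lav",
}


def to_iso639_2b(label):
    """Return the ISO 639-2b code for a local label, or None if it is unmapped."""
    # Normalize whitespace and capitalization before the lookup.
    return LOCAL_TO_ISO639_2B.get(label.strip().title())


if __name__ == "__main__":
    for label in ("English", "latvian", "Klingon"):
        print(label, "->", to_iso639_2b(label))
```

Returning None for unmapped labels, rather than raising, leaves room for a local list that is still incomplete.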
Place of Publication/Production (D28) - Controlled Terms - External
Recommended values from Geonames or other database (such as Getty TGN?)
Data type: URI
Sample value: London
Current systems usage: DAMS, DB, OE, DV
Subjects - Topics (D30) - Controlled Terms - External
Recommended values are from LCSH, Getty Vocabs, FAST (ETD currently uses ProQuest research topics - minimum of one, up to three)
Data type: URI
Sample value: Surfing
Range of values: Getty AAT; LCSH; Getty TGN; FAST; ProQuest Research Topics
Current systems usage: DAMS, ETD, OE, KEEP, DV
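The ETD cardinality rule for research topics (at least one, no more than three) could be checked along these lines. This is a sketch under assumptions: the function name validate_research_topics and the topic labels in the usage lines are placeholders, not actual ProQuest terms or ETD application code.

```python
def validate_research_topics(topics, minimum=1, maximum=3):
    """Return a list of problems; an empty list means the selection is acceptable."""
    # Drop blank entries and duplicates while preserving the submitted order.
    cleaned = list(dict.fromkeys(t.strip() for t in topics if t.strip()))
    problems = []
    if len(cleaned) < minimum:
        problems.append(f"at least {minimum} research topic is required")
    if len(cleaned) > maximum:
        problems.append(f"no more than {maximum} research topics are allowed")
    return problems


if __name__ == "__main__":
    print(validate_research_topics(["Topic A"]))
    print(validate_research_topics([]))
    print(validate_research_topics(["Topic A", "Topic B", "Topic C", "Topic D"]))
```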
Subject - Names (D31) - Controlled Terms - External
Recommended values from LCNAF, VIAF, ULAN
Data type: URI
Sample value: Caillebotte, Gustave (French painter, 1848-1894)
Range of values: LCNAF; VIAF; Getty ULAN
Current systems usage: DAMS, DB
Subject - Geographic Names (D32) - Controlled Terms - External
Recommended values from Geonames or other database (such as Getty TGN?)
Data type: URI
Sample value: Beirut (inhabited place)
Current systems usage: DAMS, DB, Keep, DV
Subject - Time Periods (D33) - Controlled Terms - Local/External
Recommended values from database of time periods (LCSH, AAT?)
Data type: String/URI
Sample value: Taisho Era
Current systems usage: DV
Thesis/Dissertation Degree (D39) - Controlled Terms - Local
Subtype of sorts for Submission Type
Data type: String
Sample value: M.T.S
Range of values: ETD local values
Current systems usage: ETD
School (D42) - Controlled Terms - Local
Top level unit
Data type: String
Sample value: Candler School of Theology
Range of values: ETD local values
Current systems usage: ETD
Department/Program (D40) - Controlled Terms - Local
Mid-level unit
Data type: String
Sample value: Theological Studies
Range of values: (consult with ETD project team for correct configurations)
Current systems usage: ETD
Academic Subfield/Discipline (D41) - Controlled Terms - Local
Lowest level - certain Departments/Programs within Laney and Rollins have subfields
Data type: String
Sample value: Hebrew Bible
Range of values: (consult with ETD project team for correct configurations)
Biology (Laney)
Biostatistics (Rollins)
Business (Laney)
Environment (Rollins)
Epidemiology (Rollins)
Executive Programs (Rollins)
Psychology (Laney)
Religion (Laney)
Current systems usage: ETD
Publisher Version (D44) - Controlled Terms - Local (could investigate external ontology like SPAR)
Data type: String or URI depending?
Sample value: Final Published Version
Range of values:
Preprint: Prior to Peer Review
Post-print: After Peer Review
Final Publisher PDF
Current systems usage: OE
Role - Creator (D45) - Controlled Terms - External/Local?
Recommended values are from either locally created IDs or external databases such as LCNAF, VIAF, etc.
Data type: String or URI depending?
Sample value: http://id.loc.gov/authorities/names/n2014001558
Range of values: LCNAF; VIAF; Emory Shared Data; local terms...
Current systems usage: DAMS, DB, ETD, OE, Keep, DV
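Because several fields may hold either a URI or a plain string, an implementation would need some way to tell the two apart. The following sketch assumes that only http and https values count as URIs; the function name classify_value is hypothetical.

```python
from urllib.parse import urlparse


def classify_value(value):
    """Label a stored field value as 'uri' or 'string'."""
    parsed = urlparse(value.strip())
    # Assumption: a controlled-term URI always has an http(s) scheme and a host.
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "uri"
    return "string"


if __name__ == "__main__":
    print(classify_value("http://id.loc.gov/authorities/names/n2014001558"))
    print(classify_value("John Doe"))
```

This only checks the shape of the value; it does not confirm that the URI resolves to an authority record.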
Role - Contributor (D46) - Controlled Terms - External/Local?
Recommended values are from either locally created IDs or external databases such as LCNAF, VIAF, etc.
Data type: String or URI depending?
Sample value: John Doe
Range of values: LCNAF; VIAF; Emory Shared Data; local terms...
Current systems usage: DAMS, DB, OE, Keep, DV
Role - Thesis/Dissertation Advisor (D47) - Controlled Terms - Local/External?
Currently free-text. Plans to connect to Emory Shared Data (ESD)
Data type: String
Sample value: Witte,John
Range of values: N/A
Current systems usage: ETD
Role - Committee Member (D48) - Controlled Terms - Local
Currently free-text. Plans to connect to Emory Shared Data (ESD)
Data type: String
Sample value: Smith,Ted A.
Range of values: N/A
Current systems usage: ETD
Role - Degree Granting Institution (D49) - Controlled Terms - Local
Similar to Institution (D1) regarding values. The value is stored (for use with ProQuest exports) but not displayed locally.
Data type: String or URI (recommended)
Sample value: Emory University
Range of values: http://id.loc.gov/authorities/names.html
Current systems usage: ETD (new)
Role - Sponsor (D50) - Controlled Terms - Local
Locally created IDs or piped from Emory personnel database
Data type: String or URI depending?
Sample value: Harold K. Simon
Range of values: [from Emory Shared Data feed]
Current systems usage: OE
Role - Partnering Agencies (D52) - Controlled Terms - Local
Local list - Rollins only. Can select multiple partnering agencies.
Data type: String
Sample value: Centers for Disease Control and Prevention
Range of values: ETD local values
Current systems usage: ETD
Role - Grant/Funding Agency (D53) - Controlled Terms - Local
Locally created list of terms; if an appropriate data source can be determined we could use an authority instead.
Data type: String?
Sample value: National Institute of Environmental Health Sciences : NIEHS
Range of values:
Current systems usage: OE, DV
Creator/Contributor - Institutional Affiliation (D55) - Controlled Terms - External
Similar to Institution (D1) and Role - Degree Granting Institution (D49)
Data type: URI
Sample value: Spelman College
Range of values: http://id.loc.gov/authorities/names.html
Current systems usage: OE, DV
Rights Statement - Controlled (D57) - Controlled Terms - External
Values will be taken from rightsstatements.org (available by default in Hyrax).
Data type: URI
Sample value: http://rightsstatements.org/vocab/NoC-CR/1.0/
Range of values: http://rightsstatements.org/page/1.0/
Current systems usage: DAMS, Keep
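A basic shape check for rights statement URIs could look like the sketch below. It assumes that all accepted values follow the pattern of the sample value (http scheme, /vocab/ path, version 1.0); the pattern and function name are hypothetical, and the check does not confirm that a given statement actually exists at rightsstatements.org.

```python
import re

# Assumed pattern, modeled on the sample value http://rightsstatements.org/vocab/NoC-CR/1.0/
RIGHTS_STATEMENT_URI = re.compile(
    r"^http://rightsstatements\.org/vocab/[A-Za-z-]+/1\.0/$"
)


def looks_like_rights_statement(uri):
    """Return True if the URI has the expected rightsstatements.org shape."""
    return RIGHTS_STATEMENT_URI.match(uri.strip()) is not None


if __name__ == "__main__":
    print(looks_like_rights_statement("http://rightsstatements.org/vocab/NoC-CR/1.0/"))
    print(looks_like_rights_statement("https://creativecommons.org/licenses/by/4.0/"))
```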
Rights Holder (D58) - Controlled Terms - External/Local
Prefer controlled name entries if available. May need to be locally managed name entries or auto-suggested terms.
Data type: String or URI depending on whether an authority entry exists
Sample value: https://orcid.org/0000-0002-4011-3451
Range of values: ORCID; LCNAF; VIAF; Getty ULAN; local terms...
Current systems usage: DAMS, KEEP, DB?
Re-use License (D60) - Controlled Terms - External
Values will be taken from Creative Commons, or other vocabularies if applicable (GNU, ODL). Current usage is Creative Commons only. For ETD, this might be added in a future release.
Data type: URI
Sample value: https://creativecommons.org/licenses/by/4.0/
Range of values: Creative Commons
Current systems usage: OE, ETD? (link)
Geographic Unit (D66) - Controlled Terms - Local
Currently free-text, but recommend we either implement the below set of local terms, or configure this field to auto-suggest already stored values.
Data type: String
Sample value: village
Range of values: See recommended values defined in MADS hierarchical Geographic element (define as locally managed list)
continent
country
province
region
state
territory
county
city
citySection
island
area
extraterrestrialArea
Current systems usage: DV
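The auto-suggest option mentioned for this field could be sketched as a case-insensitive prefix match over the recommended terms. The function name suggest_units is hypothetical; a real implementation might instead draw on values already stored in the repository.

```python
# Recommended local terms, taken from the range of values listed for this field.
GEOGRAPHIC_UNITS = [
    "continent", "country", "province", "region", "state", "territory",
    "county", "city", "citySection", "island", "area", "extraterrestrialArea",
]


def suggest_units(prefix, terms=GEOGRAPHIC_UNITS):
    """Return the terms that start with the typed prefix, ignoring case."""
    needle = prefix.strip().lower()
    if not needle:
        return []
    return [term for term in terms if term.lower().startswith(needle)]


if __name__ == "__main__":
    print(suggest_units("ci"))
    print(suggest_units("village"))
```

Note that a free-text value such as "village" gets no suggestion from this list, so existing values outside the recommended set would need review before the list is enforced.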
Preservation Metadata
Final Preservation Events/Workflows metadata worksheet
Preservation Event/Workflow Type (PE3)
Controlled terms from LC Preservation Events vocabulary
Data type: URI
Sample value: http://id.loc.gov/vocabulary/preservation/eventType/fix
Range of values: http://id.loc.gov/vocabulary/preservation/eventType.html
Fixity Check
Format Identification
Message Digest Calculation
Metadata Extraction
Quarantine
Replication
Validation
Virus check
Normalization
Capture
Filename Change
Information Package Merging
Metadata Modification
Policy Assignment
Recovery
Redaction
Un-quarantine
Accession
Ingest
Dissemination
Deletion
Local terms added
Version
Decommission
Maintenance
Current systems usage: New/DLP
Initiating User (PE5)
Controlled terms - username or system process name (Names are supplied by the DLP application)
Data type: text or URI (depending on implementation)
Sample value: eporter
Range of values: List of systems/vetted users
Current systems usage: New/DLP
Preservation Event/Workflow - Rights Basis (PE12)
Local Controlled terms
Data type: String
Sample value: Deed of Gift/Sale
Range of values:
Preservation System Policy (default value unless manually overridden)
In copyright
In copyright - Section 108
In copyright - Section 107
Public Domain
License
Deed of Gift/Sale
Institutional Policy
Statute
Administrative Signoff
Current systems usage: New/DLP
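The "default value unless manually overridden" behavior for Rights Basis could work roughly as follows. This is a sketch under assumptions: the function name resolve_rights_basis is hypothetical, and rejecting unknown overrides with an error is one possible design, not a documented requirement.

```python
# Local controlled terms, copied from the range of values listed for PE12.
RIGHTS_BASIS_TERMS = [
    "Preservation System Policy",
    "In copyright",
    "In copyright - Section 108",
    "In copyright - Section 107",
    "Public Domain",
    "License",
    "Deed of Gift/Sale",
    "Institutional Policy",
    "Statute",
    "Administrative Signoff",
]
DEFAULT_RIGHTS_BASIS = "Preservation System Policy"


def resolve_rights_basis(override=None):
    """Return the default rights basis unless a valid override is supplied."""
    if override is None:
        return DEFAULT_RIGHTS_BASIS
    if override not in RIGHTS_BASIS_TERMS:
        raise ValueError(f"not a recognized rights basis: {override!r}")
    return override


if __name__ == "__main__":
    print(resolve_rights_basis())
    print(resolve_rights_basis("Deed of Gift/Sale"))
```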
File Use Vocabulary - Local Terms
Proposed additions to the PCDM File Use Vocabulary, which is used to relate files within a digital object (used in a preservation package). These terms could be used as relationship entries, or they could serve as a file-labeling scheme, depending on implementation needs.
Data type: URI or free text
Sample value: http://pcdm.org/use#ExtractedText
Range of values: Original PCDM File Use Vocabulary terms:
extracted text
intermediate file
original file
preservation metadata file
service file
thumbnail image
transcript
character positioning data
Additional local terms proposed:
Primary File [content file, e.g. ETDs]
Supplemental File [content, e.g. ETDs]
PREMIS
METS
Supplemental Technical Metadata
Supplemental Descriptive Metadata
Supplemental Source Metadata
License/agreement
Current systems usage: New/DLP
Rights Metadata
Final Rights metadata specification worksheet
Note: additional Rights metadata is documented in the Descriptive and Preservation sections above.
Data Classification (R10)
Categorization of the types of data that may be found in a repository object. Pending Emory IT Security policy development, we will initially use the terms provided in the Emory Disk Encryption policy.
Data type: String
Sample value: Confidential
Range of values: See Emory Disk Encryption policy for suggested initial values.
Public
Internal
Confidential
Restricted
Current systems usage: N/A (new)
Sensitive/Objectionable Material (R11)
Indicates if the materials contain sensitive or objectionable information.
Data type: String
Sample value: Yes
Range of values:
Yes
No (default)
Current systems usage: New/DLP
Copyright Question #1 [Permissions beyond Fair use...] (R7) (Yes/No)
ETD submission screening question
Data type: String
Sample value: Yes
Range of values:
Yes
No
Current systems usage: ETD
Copyright Question #2 - Does thesis contain content for which you no longer own copyright… (R8) (Yes/No)
ETD submission screening question
Data type: String
Sample value: No
Range of values:
Yes
No
Current systems usage: ETD
Copyright Question #3 [Patentable Material] (R9) (Yes/No)
ETD submission screening question
Data type: String
Sample value: Yes
Range of values:
Yes
No
Current systems usage: ETD
Embargo (ETDR6) (Yes/No)
If No is selected, the record will be “Open Access”
Data type: String
Sample value: N/A
Range of values:
Yes
No
Current systems usage: ETD
Administrative Metadata
Note: Additional Administrative metadata was inventoried, but the M-IWG decided not to normalize it as part of the DLP Working Group scope, because it depends on future implementation decisions that may affect staff workflows. As this metadata is finalized, documentation will be added here.
Visibility (AD32)
Supplied for migration or bulk ingest scenarios, to specify collection, object, or file visibility.
Local terms, based on Hyrax visibility options.
Values:
Public
Emory Network
Private
[Additional terms TBD as access controls are finalized]
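For migration or bulk ingest, the local visibility terms would need to be translated into the values Hyrax stores. The sketch below assumes a one-to-one pairing with the standard Hyrax visibility strings ("open", "authenticated", "restricted"); that pairing and the function name to_hyrax_visibility are assumptions, and the mapping would change as access controls are finalized.

```python
# Assumed pairing of local terms with Hyrax's stored visibility strings.
LOCAL_TO_HYRAX_VISIBILITY = {
    "Public": "open",
    "Emory Network": "authenticated",
    "Private": "restricted",
}


def to_hyrax_visibility(term):
    """Translate a local visibility term; unknown terms raise ValueError."""
    try:
        return LOCAL_TO_HYRAX_VISIBILITY[term.strip()]
    except KeyError:
        raise ValueError(f"unknown visibility term: {term!r}") from None


if __name__ == "__main__":
    for term in ("Public", "Emory Network", "Private"):
        print(term, "->", to_hyrax_visibility(term))
```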
Viewer Settings (AD33)
Supplied for migration or bulk ingest scenarios, to specify visibility and file viewer access controls.
Local terms, to flag IIIF viewer configuration options.
Values:
Standard
Restricted
[Additional terms TBD as access controls are finalized]
Other Controlled Values in New ETD system
The following entries document unique controlled values used by the new ETD application. These fields were either not present in the larger M-IWG’s inventory and normalization process, because the transition from old to new ETDs occurred in parallel, or are more closely tied to application functionality than to traditional metadata. Metadata for the 2017 ETD Rewrite Project is partially recorded here.
Embargo Level (ETDR7) - Controlled Terms - Local
Level of embargo applied to the Thesis/Dissertation, i.e. how much of the record is hidden from users. Title and Author cannot be restricted.
Data type: String
Sample value: Files
Range of values:
Files
Files and Table of Contents
Files, Table of Contents and Abstract
Current systems usage: ETD 2017
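The three embargo levels are cumulative, which can be expressed as a mapping from each level to the parts of the record it hides. This is a minimal sketch: the part names (files, table_of_contents, abstract, title, author) and the function visible_parts are hypothetical labels, not field names from the ETD application.

```python
# Parts of the record hidden at each embargo level (part names are illustrative).
EMBARGO_LEVELS = {
    "Files": {"files"},
    "Files and Table of Contents": {"files", "table_of_contents"},
    "Files, Table of Contents and Abstract": {"files", "table_of_contents", "abstract"},
}

# Title and author appear here but in no embargo level, so they are never hidden.
RECORD_PARTS = ("title", "author", "abstract", "table_of_contents", "files")


def visible_parts(level):
    """Return the record parts that remain visible under the given embargo level."""
    hidden = EMBARGO_LEVELS[level]
    return [part for part in RECORD_PARTS if part not in hidden]


if __name__ == "__main__":
    for level in EMBARGO_LEVELS:
        print(level, "->", visible_parts(level))
```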
Embargo Length (ETDR8) - Controlled Terms - Local
Options depend on school selected. Each school has different embargo length options
Data type: String
Sample value: 6 Months
Range of values:
6 Months
1 Year
2 Years
6 Years
Current systems usage: ETD 2017
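Since embargo lengths are stored as strings, computing a release date would first require converting a label into a duration. The sketch below assumes every label has the form "<number> Month(s)" or "<number> Year(s)"; the function name embargo_months is hypothetical.

```python
import re


def embargo_months(label):
    """Convert a label such as '6 Months' or '1 Year' into a number of months."""
    match = re.fullmatch(r"(\d+)\s+(Month|Year)s?", label.strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognized embargo length: {label!r}")
    count = int(match.group(1))
    unit = match.group(2).lower()
    # Years are converted to months so all lengths share one unit.
    return count * 12 if unit == "year" else count


if __name__ == "__main__":
    for label in ("6 Months", "1 Year", "6 Years"):
        print(label, "->", embargo_months(label))
```

Accepting both singular and plural unit names keeps the parser tolerant of small inconsistencies in the stored labels.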
Graduation Dates (ETD)
New values are added by developers as time goes on
Data type: String
Sample value: Fall 2018
Range of values:
Fall 2017
Spring 2018
Summer 2018
…
Fall 2020
Current systems usage: ETD 2017
Submission Type (ETD)
Type of degree granted. Umbrella field for Thesis/Dissertation Degree (D39)
Data type: String
Sample value: Dissertation
Range of values:
Honor’s thesis
Master’s thesis
Dissertation
Current systems usage: ETD 2017