The following spreadsheet template shows the required and recommended formatting for a Curate-ready pull-list. While the pull-lists prepared during the digitization and review process may vary, the following columns are required for Curate.

Required fields are indicated with an asterisk.

Note: additional metadata will also be extracted from Alma/MARC catalog records; the following fields are recommended for the pull-list itself.

| Column Heading | Explanation |
| --- | --- |
| Item ID | A numeric ID for each individual work in the spreadsheet (e.g., row number) |
| deduplication_key* | A unique ID for each individual volume in the collection; typically an ARK or barcode number |
| other_identifiers | Concatenated list of other local identifiers, e.g., barcodes, DigWF IDs, OCLC numbers, etc. |
| emory_ark | Emory ARK ID, if applicable |
| system_of_record_ID* | Alma MMSID |
| institution* | Name(s) of the institution(s) providing the material, e.g., Emory University |
| holding_repository* | Name of the library providing the material |
| administrative_unit | Name of the administrative unit within the library, if applicable |
| CSV Call Number | Not required but useful for reference. The call number will be supplied from Alma. |
| Enumeration | |
| CSV Title | |
| content_type | |
| OCLC Number | |
| ALMA MMSID | |
| Barcode | |
| DigWF ID* | |
| emory_rights_statements* | |
| internal_rights_note | |
| rights_statement* | |
| visibility* | |
| data_classifications* | |
| sensitive_material | |
| sensitive_material_note | |
| transfer_engineer | |
| date_digitized | |
| Base_Path | The base directory path where content files are stored on the server |
| MBytes | The overall file size for all content files in the work |
| PDF_Path | The base directory path for the volume-level PDF file for the work |
| PDF_Cnt | The count of PDF files to be imported |
| OCR_Path | The base directory path for the volume-level OCR file for the work |
| OCR_Cnt | The count of volume-level OCR files to be imported |
| Disp_Path | Directory containing the page-level image files (TIFFs) > Primary Content: Preservation Master File |
| Disp_Cnt | The count of page-level image files to be imported |
| Txt_Path | Directory containing the page-level plain text files > Primary Content: Transcript File |
| Txt_Cnt | The count of page-level text files to be imported |
| POS_Path | For Kirtas outputs: directory containing the page-level POS files > Primary Content: Extracted Text File |
| POS_Cnt | For Kirtas outputs: count of page-level POS files to be imported |
| ALTO_Path | For LIMB outputs: directory containing the page-level ALTO XML files > Primary Content: Extracted Text File |
| ALTO_Cnt | For LIMB outputs: count of page-level ALTO XML files to be imported |
| METS_Path | For LIMB outputs: directory for the volume-level METS file to be imported |
| METS_Cnt | For LIMB outputs: count of volume-level METS files to be imported |
| Accession.workflow_rights_basis | |
| Accession.workflow_rights_basis_date | |
| Accession.workflow_rights_basis_reviewer | |
| Accession.workflow_rights_basis_note | |
| Ingest.workflow_rights_basis | |
| Ingest.workflow_rights_basis_date | |
| Ingest.workflow_rights_basis_reviewer | |
| Ingest.workflow_rights_basis_note | |
| Ingest.workflow_notes | |
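
Before a pull-list is handed off, it can be useful to confirm that the asterisked columns are present and filled in for every row. The following is a minimal sketch of such a check, not part of Curate itself. It assumes the pull-list has been exported as CSV, that the header row uses the column names without the trailing asterisk, and that no cell contains an embedded line break. The function name `check_pull_list` and the default `REQUIRED_COLUMNS` list are hypothetical; the list covers a subset of the asterisked headings and can be extended to match the local template.

```python
import csv
import io

# Assumption: CSV headers omit the asterisk used in the template to mark
# required fields. This list is illustrative and may need extending.
REQUIRED_COLUMNS = [
    "deduplication_key",
    "system_of_record_ID",
    "institution",
    "holding_repository",
    "emory_rights_statements",
    "rights_statement",
    "visibility",
    "data_classifications",
]


def check_pull_list(csv_text, required=REQUIRED_COLUMNS):
    """Return a list of human-readable problems found in a pull-list CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    headers = reader.fieldnames or []

    # Report required columns that are absent from the header row.
    problems = [f"missing column: {name}" for name in required if name not in headers]

    # For required columns that do exist, report rows with empty values.
    # Row numbers count the header as row 1, so data starts at row 2.
    present = [name for name in required if name in headers]
    for row_number, row in enumerate(reader, start=2):
        for name in present:
            if not (row.get(name) or "").strip():
                problems.append(f"row {row_number}: empty required field {name}")
    return problems
```

To use it, read the exported file as text (for example `open(path, newline="", encoding="utf-8").read()`) and pass the result to `check_pull_list`; an empty list means none of the checked problems were found.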
