...
The following spreadsheet template shows the required and recommended formatting for a Curate-ready pull-list. While the pull-lists prepared during the digitization and review process may vary, Curate expects the columns listed below.
Required fields are indicated with an asterisk.
Note: additional metadata will also be extracted from Alma/MARC catalog records; the following fields are recommended for the pull-list itself.
Column Heading | Explanation
---|---
Item ID | A numeric ID for each individual work in the spreadsheet (e.g. row number)
deduplication_key* | A unique ID for each individual volume in the collection; typically an ARK or barcode number
other_identifiers | Concatenated list of other local identifiers, e.g. barcodes, digwf IDs, OCLC numbers
emory_ark | Emory ARK ID, if applicable
system_of_record_ID* | Alma MMSID
institution* | Name(s) of the institution(s) providing the material, e.g. Emory University
holding_repository* | Name of the library providing the material
administrative_unit | Name of the administrative unit within the library, if applicable
CSV Call Number | Not required but useful for reference. Call number will be supplied from Alma.
Enumeration |
CSV Title |
OCLC Number |
ALMA MMSID |
Barcode |
DigWF ID |
content_type* |
emory_rights_statements* |
internal_rights_note |
rights_statement* |
visibility* |
data_classifications* |
sensitive_material |
sensitive_material_note |
transfer_engineer |
date_digitized |
Base_Path | The base directory path where content files are stored on the server
MBytes | The overall file size for all content files in the work
PDF_Path | The base directory path for the volume-level PDF file for the work
PDF_Cnt | The count of PDF files to be imported
OCR_Path | The base directory path for the volume-level OCR file for the work
OCR_Cnt | The count of volume-level OCR files to be imported
Disp_Path | Directory containing the page-level image files (TIFFs) > Primary Content: Preservation Master File
Disp_Cnt | The count of page-level image files to be imported
Txt_Path | Directory containing the page-level plain text files > Primary Content: Transcript File
Txt_Cnt | The count of page-level text files to be imported
POS_Path | For Kirtas outputs: directory containing the page-level POS files > Primary Content: Extracted Text File
POS_Cnt | For Kirtas outputs: count of page-level POS files to be imported
ALTO_Path | For LIMB outputs: directory containing the page-level ALTO XML files > Primary Content: Extracted Text File
ALTO_Cnt | For LIMB outputs: count of page-level ALTO XML files to be imported
METS_Path | For LIMB outputs: directory for the volume-level METS file to be imported
METS_Cnt | For LIMB outputs: count of volume-level METS files to be imported
Accession.workflow_rights_basis |
Accession.workflow_rights_basis_date |
Accession.workflow_rights_basis_reviewer |
Accession.workflow_rights_basis_note |
Ingest.workflow_rights_basis |
Ingest.workflow_rights_basis_date |
Ingest.workflow_rights_basis_reviewer |
Ingest.workflow_rights_basis_note |
Ingest.workflow_notes |
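
A pull-list can be checked against the asterisked fields above before it is handed off for ingest. The following is a minimal sketch, not part of Curate itself. It assumes the pull-list has been exported to CSV, that the header cells carry the field names without the asterisk, and that `content_type` is one of the required fields. The names `REQUIRED_COLUMNS` and `check_pull_list` are hypothetical. Because `deduplication_key` is described as a unique ID, the sketch also flags repeated values in that column.

```python
import csv
import io

# Fields marked with an asterisk in the table above (assumed to appear
# as CSV headers without the asterisk).
REQUIRED_COLUMNS = [
    "deduplication_key",
    "system_of_record_ID",
    "institution",
    "holding_repository",
    "content_type",
    "emory_rights_statements",
    "rights_statement",
    "visibility",
    "data_classifications",
]


def check_pull_list(csv_text):
    """Return a list of human-readable problems found in a pull-list CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    headers = reader.fieldnames or []

    # Required columns that are absent from the header row.
    problems = [f"missing column: {c}" for c in REQUIRED_COLUMNS if c not in headers]
    present = [c for c in REQUIRED_COLUMNS if c in headers]

    seen_keys = set()
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        # Required columns that exist but are blank in this row.
        for column in present:
            if not (row[column] or "").strip():
                problems.append(f"row {row_number}: empty {column}")

        # deduplication_key should be unique across the pull-list.
        key = (row.get("deduplication_key") or "").strip()
        if key:
            if key in seen_keys:
                problems.append(f"row {row_number}: duplicate deduplication_key {key}")
            seen_keys.add(key)
    return problems


if __name__ == "__main__":
    # Illustrative data only; the values are made up.
    sample = "deduplication_key,institution\nabc,\nabc,Emory University\n"
    for problem in check_pull_list(sample):
        print(problem)
```

An empty list means every required column is present and filled in. The check says nothing about whether the values themselves are valid for Curate (for example, whether `visibility` holds an accepted term), so it complements review of the spreadsheet rather than replacing it.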
...