...
The following spreadsheet template shows the required or recommended formatting for a Curate-ready pull-list. While the pull-lists prepared during the digitization and review process may vary, the following columns are required for Curate's bulk import method. For information about metadata requirements, see the Cor Metadata Field Usage documentation.
...
Column Heading | Explanation |
---|---|
Item ID | A numeric ID for each individual work in the spreadsheet (e.g. the original row number). Recommended for cross-referencing across pull-list versions later. |
deduplication_key* | A unique ID for each individual volume in the collection; typically an ARK or barcode number |
other_identifiers | Concatenated list of other local identifiers, e.g. barcodes, digwf IDs, OCLC numbers. Each identifier should carry a prefix indicating its type, and multiple values should be separated by pipes. |
emory_ark | Emory ARK id, if applicable |
system_of_record_ID* | Alma MMSID |
institution* | Name(s) of institutions providing the material, e.g. Emory University |
holding_repository* | Name of Library providing the material |
administrative_unit | Name of administrative unit within the Library, if applicable |
CSV Call Number | Call number will be supplied from Alma, but it is useful to have this on the pull-list for reference. |
Enumeration | Volume-level enumeration, if applicable (e.g. Volume 1, Copy 1, Edition etc.) |
CSV Title | Title will be supplied from Alma, but it is useful to have this on the pull-list for reference. |
content_type* | Supplied as URI. Recommended value: http://id.loc.gov/vocabulary/resourceTypes/txt |
emory_rights_statements* | The Emory Libraries supplied rights statement |
internal_rights_note | Additional internal rights notes or documentation |
rights_statement* | Supplied as URI from rightsstatements.org values, e.g. http://rightsstatements.org/vocab/NoC-US/1.0/ |
visibility* | See available access controls (Public, Public Low View, Emory Low Download, Rose High View, Private) |
data_classifications* | Emory-defined data classification type: Public, Confidential, Internal, Restricted |
sensitive_material | Indicate "Yes" if the volume contains sensitive material |
sensitive_material_note | Provide additional context for any sensitive material determination |
transfer_engineer | The name of the digitization technician |
date_digitized | The date of digitization for the volume (EDTF format) |
Barcode* | This is used to generate certain volume-level filenames |
Base_Path* | The base directory path where content files are stored on the server |
MBytes* | The overall file size for all content files in the work |
PDF_Path** | The base directory path for the volume-level PDF file for the work |
PDF_Cnt** | The count of PDF files to be imported |
OCR_Path** | The base directory path for the volume-level OCR file for the work |
OCR_Cnt** | The count of volume-level OCR files to be imported |
Disp_Path* | Directory containing the page level image files (TIFFs) > Primary Content: Preservation Master File |
Disp_Cnt* | The count of page-level image files to be imported |
Txt_Path** | Directory containing the page level plain text files > Primary Content: Transcript File |
Txt_Cnt** | The count of page-level text files to be imported |
POS_Path** | For Kirtas outputs: directory containing the page level POS files > Primary Content: Extracted Text File |
POS_Cnt** | For Kirtas outputs: count of page level POS files to be imported |
ALTO_Path** | For LIMB outputs: directory containing the page level ALTO XML files > Primary Content: Extracted Text File |
ALTO_Cnt** | For LIMB outputs: count of page-level ALTO XML files to be imported |
METS_Path** | For LIMB outputs: directory containing the volume-level METS file to be imported |
METS_Cnt** | For LIMB outputs: count of volume-level METS files to be imported |
Accession.workflow_rights_basis | Rights basis determination (e.g. Public Domain) for digitization |
Accession.workflow_rights_basis_date | Date of rights review (EDTF format) |
Accession.workflow_rights_basis_reviewer | Name of individual or office performing rights review |
Accession.workflow_rights_basis_note | Rights-related notes about digitization/preservation |
Accession.workflow_notes | General notes about digitization/preservation or acquisition |
Ingest.workflow_rights_basis | Rights basis determination (e.g. Public Domain) for digitization/preservation |
Ingest.workflow_rights_basis_date | Date of rights review (EDTF format) |
Ingest.workflow_rights_basis_reviewer | Name of individual or office performing rights review |
Ingest.workflow_rights_basis_note | Rights-related notes about ingest or migration |
Ingest.workflow_notes | General notes about ingest or migration, e.g. Migrated to Cor repository from LSDI Kirtas workflow during Phase 1 Migrations, 2019 |
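
Before handing a pull-list to the bulk import, it can help to check the spreadsheet header against the table above. The following is a minimal sketch, not part of Curate itself. It assumes that the columns marked with a single asterisk must always be present, that the asterisks are annotations rather than part of the header text, and that a `*_Path` column and its matching `*_Cnt` column should be filled in together. The function names and the `barcode:`/`oclc:` identifier prefixes are hypothetical, chosen only for illustration.

```python
import csv
import io

# Assumption: columns marked with a single asterisk in the table above
# are the ones that must always be present in the pull-list header.
REQUIRED_COLUMNS = [
    "deduplication_key", "system_of_record_ID", "institution",
    "holding_repository", "content_type", "emory_rights_statements",
    "rights_statement", "visibility", "data_classifications",
    "Barcode", "Base_Path", "MBytes", "Disp_Path", "Disp_Cnt",
]

# Prefixes of the <prefix>_Path / <prefix>_Cnt column pairs in the table.
PATH_COUNT_PREFIXES = ["PDF", "OCR", "Disp", "Txt", "POS", "ALTO", "METS"]


def missing_required_columns(header):
    """Return the required column names that are absent from the header."""
    # Strip whitespace and any trailing asterisk annotations before comparing.
    present = {name.strip().rstrip("*") for name in header}
    return [column for column in REQUIRED_COLUMNS if column not in present]


def unpaired_path_counts(row):
    """Return prefixes where only one of <prefix>_Path / <prefix>_Cnt is filled.

    Assumption: a path and its count are expected to appear together.
    """
    problems = []
    for prefix in PATH_COUNT_PREFIXES:
        path = (row.get(f"{prefix}_Path") or "").strip()
        count = (row.get(f"{prefix}_Cnt") or "").strip()
        if bool(path) != bool(count):
            problems.append(prefix)
    return problems


def split_identifiers(value):
    """Split a pipe-separated other_identifiers cell into a list of values."""
    return [part.strip() for part in value.split("|") if part.strip()]


# Usage with a tiny in-memory CSV; a real pull-list would be read from a file.
sample = io.StringIO(
    "Item ID,deduplication_key,other_identifiers,PDF_Path,PDF_Cnt\n"
    "1,example-key,barcode:0001|oclc:0002,/hypothetical/path,\n"
)
reader = csv.DictReader(sample)
print(missing_required_columns(reader.fieldnames))
for row in reader:
    print(unpaired_path_counts(row))
    print(split_identifiers(row["other_identifiers"]))
```

The checks only inspect the spreadsheet; they do not touch the file system, so comparing each `*_Cnt` value against the files actually on the server would be a separate step.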
...
- Kirtas outputs
- LIMB outputs
- [Barcode#].pdf
- [Barcode#].mets.xml
Page-Level Files
The Curate book import preprocessor makes the following assumptions:
...