Preparing DAMS-Managed Collections for Ingest
- Emily Porter
- Kathryn Michaelis
Note: this document is still under development; additional sections forthcoming.
Overview
If you are ingesting a collection that currently exists in the DAMS, you are well on your way to repository ingest!
The DAMS ingest process includes the following steps, some of which will require assistance from the LTDS team:
- Preparation of object-level metadata spreadsheet, including filenames and file paths
- Preparation of collection-level metadata spreadsheet
- File transfer of all needed files, preserving the directory structure recorded in the DAMS file paths
- Curate bulk import process
Metadata Preparation
The following DAMS metadata fields are required for ingest into the repository:
- Desc - Title
- Desc - Holding Repository
- Desc - Rights Statement
- Desc - RightsStatements.org Designation (URI)
- Digital Object - Data Classification
- Digital Object - Parent Identifier
- Digital Object - Visibility
- Desc - Type of Resource
- Path
Additional DAMS fields are configured for bulk import through our spreadsheet-based ingest process. See the DAMS to Curate mapping for more information.
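As a pre-flight check, the required fields listed above can be verified before reformatting begins. The following is a minimal sketch, not part of the Curate importer: it assumes the exported spreadsheet has been saved as CSV and that its column headers match the field labels above exactly. The function names are hypothetical.

```python
import csv

# Required DAMS fields, copied from the list above.
REQUIRED_FIELDS = [
    "Desc - Title",
    "Desc - Holding Repository",
    "Desc - Rights Statement",
    "Desc - RightsStatements.org Designation (URI)",
    "Digital Object - Data Classification",
    "Digital Object - Parent Identifier",
    "Digital Object - Visibility",
    "Desc - Type of Resource",
    "Path",
]


def find_missing_required(rows, required=REQUIRED_FIELDS):
    """Return (row_number, field) pairs where a required value is absent or blank."""
    problems = []
    # Spreadsheet row 1 is the header, so data rows start at 2.
    for row_number, row in enumerate(rows, start=2):
        for field in required:
            if not (row.get(field) or "").strip():
                problems.append((row_number, field))
    return problems


def check_export(csv_path):
    """Run the check against a CSV export (utf-8-sig tolerates a leading BOM)."""
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        return find_missing_required(csv.DictReader(handle))
```

An empty result means every row has a value in each required column; otherwise each pair points to a spreadsheet row and the field that needs attention.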
Reformatting DAMS-Exported Spreadsheets for Curate Ingest
As an initial step for bulk ingest, you must export your collection’s metadata from the DAMS into a spreadsheet. The exported spreadsheet will require some additional reformatting to become ingest-ready. Note that many of the columns included in the initial DAMS export will not be used in the final ingest. For assistance with exporting or reformatting spreadsheets, contact LTDS or Metadata Services. See this example of a DAMS export which has been reformatted to be Curate-ingest ready.
- Make an extra copy of the exported DAMS spreadsheet with all of its data. It is recommended to retain both the original DAMS export file and the reformatted copy in a shared drive location accessible to LTDS (e.g. Box or OneDrive).
- Ensure that all spreadsheet columns are formatted as text (preventing Excel and other spreadsheet editors from reformatting date values).
- Hide or remove extraneous DAMS columns that are unnecessary for import.
- Ultimately, only a few columns exported from DAMS will be used by the importer. To make the spreadsheet easier to navigate, it is recommended to hide or delete these unused columns.
- See the DAMS to Curate Metadata Mapping document to review which columns are used by the bulk import tool.
- Reformat any date fields into Extended Date Time Format (EDTF):
- DAMS fields:
- Desc - Date Created
- Desc - Date Published
- Rights - Copyright Date
- Rights - Access Basis - Review Date
- Rights - Digitization Basis - Review Date
- Workflow - Date Digitized
- Consolidate the various LCNAF name entries into a new single column. Concatenate the values of the following DAMS fields into a single column, using pipes as delimiters.
- New column name: subject_names
- DAMS fields:
- Desc - Subject - Personal Name - LCNAF
- Desc - Subject - Corporate Name - LCNAF
- Desc - Subject - Meeting Name - LCNAF
- Concatenate the following DAMS identifier values into a new single column, using pipes as delimiters. Add a prefix to each identifier (e.g. “barcode:”, “dams:”, etc.) to distinguish between different types of local identifiers.
- New column name: other_identifiers
- DAMS fields:
- Item ID
- Desc - Legacy Identifier
- Make sure a value is populated in the Digital Object - Parent Identifier column. This identifier will be used for deduplication in the bulk import process.
- Make sure selected DAMS columns’ values are entered as URIs (instead of their equivalent text labels).
- DAMS fields:
- Desc - Type of Resource (using URI values from the Resource Types scheme)
- Desc - RightsStatements.org Designation (URI)
- Rename the original DAMS column headers to use Curate’s fieldnames as the CSV headers.
- See the DAMS to Curate mapping to match DAMS headers to the CSV import headers.
- To assist with this process, it may be helpful to create a row beneath the original DAMS header row, add the new Curate field names as headers in the second row, and then delete the original DAMS header row once finished.
- Review the spreadsheet columns again and delete any that will not be needed for import.
- Save the modified DAMS export as CSV (UTF-8 encoded).
- Upload the Curate-formatted CSV to Box or OneDrive.
- Upload the original, unmodified DAMS export CSV to Box or OneDrive.
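Several of the steps above (date reformatting, pipe-delimited concatenation, identifier prefixes, header renaming, and saving as UTF-8 CSV) can be scripted rather than done by hand. The following is a rough sketch under stated assumptions, not the LTDS workflow itself: it assumes dates in the export look like MM/DD/YYYY and converts only those to the EDTF form YYYY-MM-DD, leaving anything else untouched for manual review. The "legacy:" prefix and the function names are placeholders, and the Curate header names passed in `header_map` must come from the DAMS to Curate mapping.

```python
import csv
from datetime import datetime

NAME_FIELDS = [
    "Desc - Subject - Personal Name - LCNAF",
    "Desc - Subject - Corporate Name - LCNAF",
    "Desc - Subject - Meeting Name - LCNAF",
]
# Prefixes are illustrative; choose labels that suit your collection.
IDENTIFIER_PREFIXES = {
    "Item ID": "dams:",
    "Desc - Legacy Identifier": "legacy:",
}


def to_edtf(value, input_format="%m/%d/%Y"):
    """Convert one date string to YYYY-MM-DD; return it unchanged if it does not parse."""
    try:
        return datetime.strptime(value.strip(), input_format).date().isoformat()
    except ValueError:
        return value


def join_values(row, fields, prefixes=None):
    """Pipe-join the non-empty values of `fields`, optionally prefixing each one."""
    prefixes = prefixes or {}
    parts = []
    for field in fields:
        value = (row.get(field) or "").strip()
        if value:
            parts.append(prefixes.get(field, "") + value)
    return "|".join(parts)


def reformat_row(row, header_map, date_fields=()):
    """Build one Curate-ready row from one DAMS export row."""
    out = {}
    for dams_header, curate_header in header_map.items():
        value = row.get(dams_header) or ""
        out[curate_header] = to_edtf(value) if dams_header in date_fields else value
    out["subject_names"] = join_values(row, NAME_FIELDS)
    out["other_identifiers"] = join_values(row, IDENTIFIER_PREFIXES, IDENTIFIER_PREFIXES)
    return out


def write_curate_csv(rows, csv_path):
    """Write the reformatted rows as a UTF-8 encoded CSV."""
    rows = list(rows)
    if not rows:
        return
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0]))
        writer.writeheader()
        writer.writerows(rows)
```

Because only the columns named in `header_map` are copied, unused DAMS columns drop out of the output automatically. Date values that are ranges, approximate, or partial still need to be converted to EDTF by hand.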
DAMS Filename Conventions for Bulk Import
The Curate bulk-import process is optimized to work with the following filename conventions in use within DAMS-managed collections. If your collection's files use a different convention, please contact LTDS for support.
While file naming practices may vary, it is strongly recommended that all filenames contain or end with a numeric part sequence, such as "P0001.tif".
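To spot filenames that lack a numeric part sequence before transfer, a quick check along these lines may help. This is a sketch only: the pattern is inferred from the "P0001.tif" example and the filename examples on this page, not taken from the importer, and the function name is hypothetical.

```python
import re

# A part sequence: "P" plus digits, optionally followed by a suffix such as
# _ARCH or _PROD, immediately before the file extension.
PART_SEQUENCE = re.compile(r"(?:^|_)P\d+(?:_[A-Za-z]+)?\.[^.]+$")


def files_without_part_sequence(filenames):
    """Return the filenames that do not end with a numeric part sequence."""
    return [name for name in filenames if not PART_SEQUENCE.search(name)]
```

Any names returned are candidates for renaming or for a conversation with LTDS.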
Digitized Photographs/Post Cards
In these types of outputs, there may be two files for each component part, one which contains color bars (_ARCH) and one which is cropped and straightened (_PROD).
Example Collection: Langmuir Photograph Collection
Filename Examples:
- MSS1218_B071_I205_P0001_ARCH.tif
- MSS1218_B071_I205_P0001_PROD.tif
- MSS1218_B071_I205_P0002_ARCH.tif
- MSS1218_B071_I205_P0002_PROD.tif
Curate import notes:
- For works that have two sets of images, these files will be grouped, ordered and labeled as "Front" or "Back".
- For works that have more than two sets of images, these files will be grouped, ordered and labeled as "Image 1", "Image 2", "Image 3" etc.
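The grouping and labeling described in these notes can be approximated as follows. This is an illustrative sketch of the behavior as described here, not the importer's actual code; the regular expression and function name are assumptions. The notes do not say how a work with a single set of images is labeled, so the sketch falls back to "Image 1" in that case.

```python
import re
from collections import defaultdict

# Part number, optional _ARCH/_PROD suffix, then the file extension.
PART_RE = re.compile(r"_P(\d+)(?:_(ARCH|PROD))?\.\w+$")


def label_parts(filenames):
    """Group files by part number and return {label: [filenames]} in part order."""
    groups = defaultdict(list)
    for name in filenames:
        match = PART_RE.search(name)
        if match:
            groups[int(match.group(1))].append(name)
    ordered = [sorted(groups[part]) for part in sorted(groups)]
    if len(ordered) == 2:
        labels = ["Front", "Back"]
    else:
        labels = [f"Image {n}" for n in range(1, len(ordered) + 1)]
    return dict(zip(labels, ordered))
```

With the four Langmuir filenames above, the P0001 pair is labeled "Front" and the P0002 pair "Back". Files whose names do not match the pattern are skipped by this sketch.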
Image Sequences or Views of an Object
In these types of outputs, there is typically only one file for each part (_ARCH and _PROD are not included in the filename scheme), and these sequences are not based on fronts or backs.
Example Collection: Health Sciences Artifact Collection
Filename Examples:
- HS-S023_B071_P001.tif
- HS-S023_B071_P002.tif
- HS-S023_B071_P003.tif
- HS-S023_B071_P004.tif
Curate import notes:
- These files will be ordered and labeled as "Image 1", "Image 2", "Image 3" etc.