Appendix 1 - Archivists' Toolkit Import

This is the appendix to Ruth Kitchin Tillman and Molly Schwartz’s 2012 MARAC Presentation on Archivists’ Toolkit and EAD.

In comparing the University of Maryland’s EAD file for the Richard White collection with the Archivists’ Toolkit record created on import and with that record’s export file, we noticed several problems that are not germane to the paper’s topic but may be of interest to those learning Archivists’ Toolkit in order to manage existing EAD records. This appendix outlines those issues.

A. Multiple date fields

  1. The program only imported the first date from the <publicationstmt> field.

  2. The program imported both the @type=inclusive and @type=bulk fields from <unitdate> but conflated them into the same field, yielding an odd-looking “1905-1920 1905-1920.”
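Both date issues can be spotted before import by listing the dates in the source file. The following is a minimal sketch, assuming an EAD 2002 file without a namespace; the fragment, its values, and the function names are hypothetical and are not taken from the Richard White file.

```python
import xml.etree.ElementTree as ET

# Hypothetical, abbreviated EAD fragment (no namespace); values are illustrative only.
SAMPLE = """<ead>
  <eadheader>
    <filedesc>
      <publicationstmt>
        <publisher>Example Repository</publisher>
        <date>2005</date>
        <date>2011</date>
      </publicationstmt>
    </filedesc>
  </eadheader>
  <archdesc level="collection">
    <did>
      <unitdate type="inclusive">1905-1920</unitdate>
      <unitdate type="bulk">1910-1915</unitdate>
    </did>
  </archdesc>
</ead>"""


def publication_dates(root):
    """Return the text of every <date> directly inside <publicationstmt>."""
    return [(d.text or "").strip() for d in root.findall(".//publicationstmt/date")]


def unitdates_by_type(root):
    """Return (type, text) pairs for each <unitdate>, in document order."""
    return [(u.get("type", "untyped"), (u.text or "").strip())
            for u in root.iter("unitdate")]


root = ET.fromstring(SAMPLE)
pub_dates = publication_dates(root)
unit_dates = unitdates_by_type(root)

# Flag the two situations described in section A.
if len(pub_dates) > 1:
    print("More than one publication date; check which ones import:", pub_dates)
if len({date_type for date_type, _ in unit_dates}) > 1:
    print("Several unitdate types; check for conflation after import:", unit_dates)
```

Running a check like this over a batch of files shows which ones need their date fields reviewed by hand after import.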

B. MARC fields and @encodinganalog

Archivists’ Toolkit did not import MARC fields from @encodinganalog, nor does it offer an option to include the encodinganalog attribute in its EAD export files. The MARC XML export, however, seemed to interpret the data properly and corresponded with the encodinganalog values specified in the imported file.
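Because the attribute does not survive the round trip, it may be worth recording the @encodinganalog values of a file before importing it. This is a minimal sketch, assuming an EAD 2002 file without a namespace; the sample fragment, its MARC values, and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; the MARC field values are illustrative only.
SAMPLE = """<archdesc level="collection">
  <did>
    <unittitle encodinganalog="245$a">Sample papers</unittitle>
    <unitdate encodinganalog="245$f" type="inclusive">1905-1920</unitdate>
    <physdesc><extent>2 linear feet</extent></physdesc>
  </did>
</archdesc>"""


def encodinganalog_pairs(root):
    """Return (element name, encodinganalog value) for every element that has one."""
    return [(el.tag, el.get("encodinganalog"))
            for el in root.iter()
            if el.get("encodinganalog") is not None]


pairs = encodinganalog_pairs(ET.fromstring(SAMPLE))
for tag, marc_field in pairs:
    print(f"<{tag}> maps to MARC {marc_field}")
```

The resulting list can be kept alongside the imported record and compared against the MARC XML export.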

C. Folder hierarchy

The folders’ @parent field did not import, although their folder type, number, and nesting structure did.
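To see how many containers in a file rely on @parent, the attribute can be listed before import. The sketch below assumes an EAD 2002 file without a namespace; the sample component, its id values, and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical component; ids and numbers are illustrative only.
SAMPLE = """<c01 level="file">
  <did>
    <container id="box1" type="box">1</container>
    <container parent="box1" type="folder">3</container>
    <unittitle>Correspondence</unittitle>
  </did>
</c01>"""


def containers_with_parent(root):
    """Return (type, number, parent) for each <container> carrying a @parent attribute."""
    return [(c.get("type"), (c.text or "").strip(), c.get("parent"))
            for c in root.iter("container")
            if c.get("parent") is not None]


linked = containers_with_parent(ET.fromstring(SAMPLE))
print(f"{len(linked)} container(s) use @parent:", linked)
```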

D. Manipulated data

  1. In its EAD export file, Archivists’ Toolkit adds the text “Finding aid processed by” directly after its <author> tag. This will cause problems for any imported file that included more than the authors’ names in that field, and it will annoy anyone who does not want additional text in that area.
  2. Archivists’ Toolkit removes double-spacing after periods, in deference to current rules of style and, more importantly, XML rules for data not designated as CDATA.
  3. For note fields which did not have headers, AT added a header with the Note’s title inside a <head> tag.
  4. Although nested <bibliography> tags are permitted in EAD, the nested tags were imported as separate Bibliography notes.

Note: The effect achieved by the sample import’s <bibliography><p></p><bibliography>list of items</bibliography></bibliography> structure, in which <p> contained information about the bibliography’s contents, could have been achieved in AT using the Note field inside the Bibliography note entry box. That structure would have exported as <bibliography><p></p>list of items</bibliography>. Although this is possible in AT, it would still require the importing archivist to do additional work after import.
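Files that use nested <bibliography> elements can be identified ahead of time, so the archivist knows which Bibliography notes will need merging after import. This is a minimal sketch, assuming an EAD 2002 file without a namespace; the sample fragment and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment mirroring the nested structure described above.
SAMPLE = """<archdesc level="collection">
  <bibliography>
    <p>Works about the collection's creator.</p>
    <bibliography>
      <bibref>First example reference</bibref>
      <bibref>Second example reference</bibref>
    </bibliography>
  </bibliography>
</archdesc>"""


def count_nested_bibliographies(root):
    """Count <bibliography> elements that are direct children of another <bibliography>."""
    return sum(len(outer.findall("bibliography"))
               for outer in root.iter("bibliography"))


nested = count_nested_bibliographies(ET.fromstring(SAMPLE))
print(f"{nested} nested <bibliography> element(s) found")
```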

E. Access Control

The final major data change during the import was within <controlaccess>. The original file contained three <controlaccess> fields, which were structured as paragraphs with subject, person, and corporate names. Archivists’ Toolkit imported all the data within <subject>, <persname>, and <corpname> into the “Names and Subjects” section with @source=ingest. The data in the paragraphs was discarded.
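A pre-import check can separate the tagged names from the paragraph text around them. The sketch below assumes that the discarded data is the text inside <p> but outside <subject>, <persname>, and <corpname>, and that the file is EAD 2002 without a namespace; the sample fragment and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; names are illustrative only.
SAMPLE = """<archdesc level="collection">
  <controlaccess>
    <p>Papers relate to <persname>White, Richard</persname> and the
    <corpname>Example Society</corpname>.</p>
  </controlaccess>
</archdesc>"""

TAGGED = {"subject", "persname", "corpname"}


def split_controlaccess(root):
    """Return (tagged names, leftover paragraph text) for <p> children of <controlaccess>."""
    tagged, leftover = [], []
    for ca in root.iter("controlaccess"):
        for p in ca.findall("p"):
            pieces = [p.text or ""]
            for child in p:
                if child.tag in TAGGED:
                    tagged.append((child.tag, (child.text or "").strip()))
                # Text following a child element belongs to the paragraph itself.
                pieces.append(child.tail or "")
            text = " ".join("".join(pieces).split())  # normalize whitespace
            if text:
                leftover.append(text)
    return tagged, leftover


names, paragraph_text = split_controlaccess(ET.fromstring(SAMPLE))
print("Tagged names expected in Names and Subjects:", names)
print("Paragraph text to check after import:", paragraph_text)
```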

It is important to note all these issues, and we strongly encourage archivists who are considering an import to attempt a few test imports first and look for any major differences. Even so, the majority of the file imported without any loss of or change to the data.