Pull It Together! Presentation on Exporting, Augmenting and Re-Ingesting Data in Fedora 3 and 4

Update: I redid my presentation as an article published in issue 31 of Code4Lib Journal.

At the DC Fedora Users Group meeting (October 7-8), I presented on a project I’d done at work with data in our repository. You can download my presentation as a PDF or PowerPoint, along with code samples and walkthroughs, from the presentation’s GitHub repository. The PowerPoint/PDF contain a textual version of my presentation as speaker notes so you can read along with the slides. In the project, I used the data from one collection in our repository (Authors & Publications) to augment a second collection (Colloquia). The presentation/process may be of interest to anyone interested in creating a local reconciliation service, updating/augmenting their RELS-EXT in Fedora 3, generally batch-updating Fedora 3, or seeing how updates work in Fedora 4. The presentation covers four steps in which I:

  1. Exported local author data from our Fedora repository, created a Turtle/RDF file with all author data, and added the file as an RDF-based reconciliation service to OpenRefine with the DERI RDF plugin.
  2. Crosswalked gSearch results into a CSV with a row per dc:creator entry in a colloquia record.
  3. Reconciled the dc:creators in OpenRefine with the service created in step 1.
  4. Created batch ingest files for both Fedora 3 and Fedora 4, and worked through the challenges of using the Fedora Batch Modify language.
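
To make step 2 a little more concrete, here is a minimal Python sketch of turning records into a CSV with one row per dc:creator entry. This is not the code from the presentation; the record structure, the column names, and the sample PIDs and names are all hypothetical stand-ins, and the real gSearch output would need to be parsed into this shape first.

```python
import csv
import io

# Hypothetical input: each record has a PID, a title, and a list of
# dc:creator values. The actual gSearch response format is not shown here.
SAMPLE_RECORDS = [
    {"pid": "colloquia:1", "title": "Talk A", "creators": ["Smith, Jane", "Doe, John"]},
    {"pid": "colloquia:2", "title": "Talk B", "creators": ["Smith, Jane"]},
]

FIELDNAMES = ["pid", "title", "creator_position", "dc_creator"]


def rows_per_creator(records):
    """Flatten records so each dc:creator entry gets its own row."""
    rows = []
    for record in records:
        for position, creator in enumerate(record.get("creators", []), start=1):
            rows.append(
                {
                    "pid": record["pid"],
                    "title": record["title"],
                    # Keeping the position lets the reconciled value be
                    # matched back to the right creator entry later.
                    "creator_position": position,
                    "dc_creator": creator.strip(),
                }
            )
    return rows


def to_csv(rows):
    """Serialize the flattened rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(rows_per_creator(SAMPLE_RECORDS)))
```

A file shaped like this can be opened in OpenRefine, where the dc_creator column is the one to reconcile against the service from step 1.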

The end product was an updated RELS-EXT or RDF dc:creator relationship that connected Colloquia record data about speakers who had an author record in the Authors & Publications collection to that author record. By inverting the SPARQL search used to generate queries, this would also allow Author pages to list the Colloquia that the author had delivered.
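
As a rough illustration of that relationship and its inverse, the Python sketch below builds a dc:creator triple between two Fedora 3 objects and a pair of SPARQL queries that walk it in each direction. The PIDs and function names are hypothetical, and the query shapes are an assumption about what such a lookup could look like, not the queries used in the presentation.

```python
# Dublin Core elements namespace for dc:creator.
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"


def fedora_uri(pid):
    """Fedora 3 identifies objects in RDF as info:fedora/<pid>."""
    return f"info:fedora/{pid}"


def creator_triple(colloquium_pid, author_pid):
    """N-Triples statement linking a colloquium to an author record."""
    return (
        f"<{fedora_uri(colloquium_pid)}> <{DC_CREATOR}> "
        f"<{fedora_uri(author_pid)}> ."
    )


def authors_of_colloquium_query(colloquium_pid):
    """Forward direction: which author records does this colloquium point to?"""
    return (
        "SELECT ?author WHERE { "
        f"<{fedora_uri(colloquium_pid)}> <{DC_CREATOR}> ?author "
        "}"
    )


def colloquia_by_author_query(author_pid):
    """Inverse direction: which colloquia point to this author record?"""
    return (
        "SELECT ?colloquium WHERE { "
        f"?colloquium <{DC_CREATOR}> <{fedora_uri(author_pid)}> "
        "}"
    )


if __name__ == "__main__":
    # Hypothetical PIDs, for illustration only.
    print(creator_triple("colloquia:1", "authors:42"))
    print(colloquia_by_author_query("authors:42"))
```

The only difference between the two queries is which side of the triple pattern is the variable, which is why the same relationship can serve both the Colloquia pages and the Author pages.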