After having worked on it for the last 7 months or so, I’ve finally finished creating EADiva, a site which functions as a friendlier version of the EAD tag library. In my introductory post on its blog, I note that this isn’t a replacement for the Library of Congress’s tag library or excellent resources.

I used the resources from the Library of Congress to create the site, although many examples are my own. Each page links to its comparable Library of Congress page. However, the site has inter-linking between element pages, cutting way down on navigation time, and spells out each attribute on the pages themselves instead of requiring the user to click over to other navigation pages.

It was a heck of a job, but it taught me a great deal about EAD and I expect to learn even more as I continue the project.

I’m working on my field study right now, arranging and describing a collection which had previously been kept in piles on shelves and in boxes. One of my favorite parts of going through the materials is putting together stories which emerge in the materials. Sometimes, because I’m describing at the item level, I’m able to convey the story within the finding aid itself. For example, from the documents of the library committee:

  • Letter from Professor [redacted] re: library staff member’s refusal to disclose circulation records, 1987
  • Letter from [library director] to Professor [redacted] re: confidentiality of library circulation records, 1987
  • Letter from [library director] to Dean [redacted] re: confidentiality of library circulation records, 1987
  • “Academic Libraries Must Oppose Federal Surveillance of Their Users” copied from Chronicle of Higher Education, 1988
  • Article “The FBI in the Library,” 1988
  • “Librarians Challenge FBI on Extent of Its Investigation” copied from News of the Week, probably 1988
  • “Librarians Want FBI to Shelve Requests About Foreign Readers,” 1988
  • Volume 37, number 1, Newsletter on Intellectual Freedom, 1988
  • “Talk of the Town” New Yorker, 1988 (Note: Concerning the Library Awareness Program)

(I had to mess slightly with my finding aid format to make this blog right, but the point comes across.)

The contents of the letters and then the slew of materials on library confidentiality, albeit primarily FBI investigations, make me imagine that the library director passed these on to the challenging professor and perhaps the rest of the faculty and deans. Am I sure? No. This is the original order of materials as left by that library director and it may simply mean that the director was working to raise library committee awareness of the subject and formulate policies. But it’s fun to see a story within the materials.

In LBSC 605, Intro to Archives, I did a literature review of articles on blog archival. I found so little that dealt with actual blogging that I had to expand it to blogs and dynamic websites. It was a bit disappointing, but preparing that review reminded me of a little blog that I wanted to save.

The Blog

In the fall of 2004, my mother was diagnosed with terminal cancer. One of her many concerns became the preservation of family stories, mostly the ones she’d told us as kids or the ones which had been told her by older relatives who were already gone. At my sister’s suggestion, she began blogging the stories in early 2005 using Xanga.

The blog consists of 19 posts from 2005 to mid-2006. Her posting frequency was affected by her treatments and she eventually began writing down the stories by hand in her free time, which she found easier than sitting at a computer. It’s an incomplete scrap of a blog, but it’s also 19 stories which I’m not sure she duplicated elsewhere. The blog hasn’t been touched since 2006.

My goal is to end up with three versions:

  1. Archive of the site in the full HTML form it has on the web (i.e. the pure & untouched files, but containing a lot of unnecessary or xanga.com-dependent code).
  2. Functional and self-contained local site with unnecessary code redacted, internal linking changed to reflect local site file names, local scripts and files substituted for linked xanga scripts and files, and anything else necessary to make the blog usable independently and with no internet connection.
  3. Content of the posts extracted into a plaintext document and a PDF document.

I want to preserve the original site form, but in order to make it function offline I’ll have to do some major editing. This shouldn’t be a real problem once I determine the layout (see below). Since it’ll require major editing, I plan to keep a zipped copy of the original files so that a) I could start over and b) the original is retained. I also realize that a representation of the site in its original form is probably not the most useful tool for someone who simply wants to read my mother’s stories. Therefore I’ve decided to copy and paste the post content into two document formats that should be fairly accessible.

Putting it in a document form should take all of an hour or two. In fact, it’s what most people would do instead of archiving the site. However, I’d like to use this as an opportunity to learn about the processes and challenges of blog archival. As it’s a small blog, I hope not to be overwhelmed by the size of the project.

My Initial Steps

1) Identify all pages that need to be saved. According to the blog archive, the blog contained 19 posts. However, a blog is more than its individual post pages.

First, there were the index page and “Next 5” pages. As is customary in blog format, each contained the full content of 5 posts in reverse chronological order (most recent posts on the front page, etc.). I saved the four instances of the index page (for 19 posts at 5 per page, this came to 3 pages of 5 posts and one of 4).

Then there were individual monthly archives pages. The format was: Main Archives Page (linked to on each page of the blog) -> Monthly Archives Page (containing links to individual posts). In order to create a holistic representation of the original site, I saved each of these instances as well.
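The index-page count above is simple enough to check in a few lines. This is only a sketch of the arithmetic; the function name index_pages is mine, and the 5-per-page default just mirrors what this particular blog showed.

```python
import math


def index_pages(post_count: int, per_page: int = 5) -> list[int]:
    """Return how many posts appear on each index page, front page first."""
    pages = math.ceil(post_count / per_page)
    # Every page is full except possibly the last one.
    return [min(per_page, post_count - i * per_page) for i in range(pages)]


print(index_pages(19))  # [5, 5, 5, 4]
```

For 19 posts this gives four index pages to save, which matches what I found on the site.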

2) Determine what code needed replacement and what could be deleted. After I saved all the original site HTML, I started looking at the page structure. Some of the display is dependent on external files from xanga.com. Other code is unnecessary to display (such as tracking scripts or scripts which allow one to post comments).

At this point, I’m still determining the overall structure. I expect that my next step will be coming up with a plan for restructuring the files and possibly a step-by-step process to make it go faster while not missing anything important. I will also need to save any necessary scripts and CSS files which are stored on xanga itself.
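One way to get a first inventory of what a saved page still pulls from elsewhere is to list every script, stylesheet link, and image whose URL names a host. The sketch below uses Python's standard html.parser; the sample markup and the static.example.com host are invented for illustration and are not taken from the actual blog.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ExternalRefs(HTMLParser):
    """Collect (tag, url) pairs for resources loaded from another host."""

    def __init__(self):
        super().__init__()
        self.refs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "img"):
            url = attrs.get("src")
        elif tag == "link":
            url = attrs.get("href")
        else:
            return
        # A URL with a network location points off the local copy.
        if url and urlparse(url).netloc:
            self.refs.append((tag, url))


def external_refs(html: str) -> list[tuple[str, str]]:
    parser = ExternalRefs()
    parser.feed(html)
    return parser.refs


# Hypothetical page fragment, not real Xanga markup.
sample = """<html><head>
<link rel="stylesheet" href="http://static.example.com/themes/site.css">
<script src="http://static.example.com/js/comments.js"></script>
<script>var inline = true;</script>
</head><body><img src="photo.jpg"></body></html>"""

for tag, url in external_refs(sample):
    print(tag, url)
```

A list like this only says what is external. Deciding whether each item gets saved locally, replaced, or deleted is still a judgment call.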

The Initial Challenges

1) File naming conventions: Xanga follows a postid/post-slug/ format. I needed an easy way to name the HTML files so that I could look at an internal blog link and know how to link it to the appropriate .html file on the local site.

I used the format postid-post-slug.html to name the files so that I could easily alter internal links by just deleting the domain and slightly changing the format, then adding .html. Some posts weren’t titled externally and therefore didn’t have post slugs. If the post didn’t have a slug, it just became postid.html.

For the index pages, I used index.html to indicate the blog’s front page at the time of archiving and index-2.html, etc, on subsequent files.
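The naming rule can be written down as a small function, which is handy when rewriting internal links in bulk. This is a sketch under assumptions: the example.xanga.com addresses and post ids are placeholders, and I'm assuming the URL path begins with the post id, as in the postid/post-slug/ pattern described above.

```python
from urllib.parse import urlparse


def local_filename(url: str) -> str:
    """Map a postid/post-slug/ style URL to a local .html file name."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if not parts:
        # A bare domain is the blog's front page.
        return "index.html"
    # postid plus optional slug, joined with a hyphen.
    return "-".join(parts[:2]) + ".html"


def index_filename(page_number: int) -> str:
    """Name the saved index pages: index.html, index-2.html, and so on."""
    return "index.html" if page_number == 1 else f"index-{page_number}.html"


# Placeholder URLs for illustration only.
print(local_filename("http://example.xanga.com/123456789/a-family-story/"))
# 123456789-a-family-story.html
print(local_filename("http://example.xanga.com/123456789/"))
# 123456789.html
print(index_filename(2))
# index-2.html
```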

2) Defining necessary elements of the page: Part of step 2 involves making judgment calls on what’s really a part of the site and what’s extraneous data which can be lost. Unseen and database-reliant scripts can easily be removed, but what about elements like a comment box? I plan to retain comments left by the friends and family who visited it, but is the comment box necessary as an example of how comments were entered? It doesn’t serve the family members, but might it serve people down the road if someone were to look at this site as a sample of a 2005-2006 blog? Or would that person be looking at the original files anyway and not need it to display here?

What about elements like the xanga sign-in link? It’s no longer necessary or functional on the site. Or what about elements which were probably added later by changes to some of the scripts, like Twitter or Facebook sign-in links in the comment section? Those were certainly added after the blog was abandoned in 2006.

I’m overthinking this part, mostly because I’m weighing how I might do things differently if I were saving the blog for an archive with a broad user base instead of a small family group. In fact, after working on Goal 2 for a while, I decided I should probably skip ahead to Goal 3 next time I have the time to work on it and extract the post content into usable files.

To be continued, as the semester allows…

Tagged as: practical, xanga project

For the first assignment in Organization of Information, we had to catalog a resource in MARC & Dublin Core. In MARC, we were required to use certain fields, such as 007/8, 1xx, 245, 5xx, 65x. Obviously, we weren’t allowed to just go to another library and look at what they’d done. I decided to challenge myself a bit and catalog Locke & Key: Keys to the Kingdom, volume 4 in a graphic novel/comic series by Joe Hill (writer) and Gabriel Rodriguez (illustrator).

Challenges using Dublin Core

I did the first record in (qualified) Dublin Core, since its documentation wasn’t as extensive as MARC’s. I also thought that starting there would give me a good record to adapt into the MARC format. I ran into a handful of challenges (which also proved challenges in MARC), most of which I solved.

  1. This is the 4th book in a series.
  2. The book has two creators (writer and illustrator), whom I wanted to give shared authorial credit. This may not be appropriate on all graphic novels, but for this series, it seemed like the best choice.
  3. The book has two other major contributors (letterer and colorist) who should get some degree of credit.
  4. Dublin Core doesn’t seem to have a way of indicating page numbers.

1) I used both Locke & Key v.4 and Locke & Key to indicate the series it was part of and its place in the series. I’m not sure if dcterms:isPartOf needed more information, but I worried putting in more than the series name could create problems for automatically-generated links.

2 & 3) I was only able to half-solve these. On the one hand, I indicated a kind of hierarchy between the two sets of creators/contributors by using the Dublin Core terminology dc:creator and dc:contributor. I listed the writer & illustrator as creators and the letterer & colorist as contributors.

Unfortunately, there was no easily-apparent way to designate each of their roles. It would probably require creating an alternative xml schema to augment DC. That would only be an appropriate step if one were creating a catalog of comics and graphic novels or working in a context where the user had a need to know these specifics. Still, I was disappointed. I considered trying to include it in the description, but that didn’t look right.

4) I wasn’t able to solve this issue at the time. Since turning in the assignment, I’ve realized that I should’ve used dcterms:sizeOrDuration. The name made me think of physical sizes, of electronic record sizes, & of AV materials. The definition talks of dimensions, extents, and times to play/execute. But the comment is:

Examples include a number of pages, a specification of length, width, and breadth, or a period in hours, minutes, and seconds.

which fits quite well. I’ve updated the record below to include this.

Some other choices I made on the record include:

  1. including “young adult” as one of the subjects, even though the book suggests that it’s for mature audiences, and making “graphic novel” a subject;
  2. including a table of contents, despite this being a work of fiction;
  3. including dcterms:isFormatOf to indicate that it was originally published in 6 issues;
  4. and calling the format a graphic novel instead of a book or any other more generic term.

1) The book contains mature themes and definitely has adult appeal, but the main characters are teenagers (and a child). It’s not more “mature” than many other books shelved in YA and the teens may be able to identify with feelings of loss and identity as experienced by the characters. My biggest concern was that this might make it harder for adults to come across a copy while browsing.

I used “graphic novel” as a subject, because it seemed like it could be useful in certain OPACs. I’m hoping to get feedback on this one way or another from the professor.

2) Because the book was originally published as comics, it seemed useful to create a table of contents reflecting the title of each comic (these are retained as chapter titles in the book, though the last chapter is indicated as part 1 & part 2 and not as dramatically delineated as the rest). Readers may have previously encountered single issues.

3) See above.

4) Other than the debatable subject field I used, there wasn’t any explicit statement that this was a graphic novel. “Comic” and “comic collection” didn’t seem appropriate. It might be useful in an OPAC which allows one to search by format.

Locke & Key: Keys to the Kingdom — cataloged using Dublin Core

<?xml version="1.0"?>
<qualifieddc xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:noNamespaceSchemaLocation="http://dublincore.org/schemas/xmls/qdc/2008/02/11/qualifieddc.xsd"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:dcterms="http://purl.org/dc/terms/">
<dc:title>Locke &amp; Key: Keys to the Kingdom</dc:title>
<dcterms:alternative>Keys to the Kingdom</dcterms:alternative>
<dcterms:alternative>Locke &amp; Key v.4</dcterms:alternative>
<dc:creator>Joe Hill</dc:creator>
<dc:creator>Gabriel Rodriguez</dc:creator>
<dc:contributor>Jay Fotos</dc:contributor>
<dc:contributor>Robbie Robbins</dc:contributor>
<dc:subject>graphic novel, comic, horror, fantasy, young adult</dc:subject>
<dc:description>The Locke children have grown accustomed to the myriad of magical keys discovered within the ancestral family home of Keyhouse. They have also grown accustomed to tragedy. What they may not be prepared for is just how closely danger stalks their every move as Lucas Caravaggio, alias Zach Wells, continues his relentless quest for the key to the black door.</dc:description>
<!--description taken from back cover material-->
<dcterms:tableOfContents>Sparrow; White; February; Casualties; Detectives: Part 1; Detectives: Part 2</dcterms:tableOfContents>
<dcterms:sizeOrDuration>160 pages</dcterms:sizeOrDuration>
<dcterms:issued>2011-06-01</dcterms:issued>
<dcterms:isFormatOf>Originally published as Locke &amp; Key: Keys to the Kingdom issues #1-6.</dcterms:isFormatOf>
<dcterms:isPartOf>Locke &amp; Key</dcterms:isPartOf>
<dc:publisher>Idea and Design Works, LLC</dc:publisher>
<dc:format>graphic novel</dc:format>
<dc:identifier>URN:ISBN:9781600108860</dc:identifier>
</qualifieddc>
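
To see what a program actually gets out of this record, here is a minimal sketch using Python's standard xml.etree.ElementTree. It embeds a trimmed copy of the record above (same namespaces, fewer elements) and pulls out the creators and contributors. The variable names are mine.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes as declared in the record above.
NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Trimmed copy of the record, kept short for the example.
RECORD = """<qualifieddc xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:dcterms="http://purl.org/dc/terms/">
<dc:title>Locke &amp; Key: Keys to the Kingdom</dc:title>
<dc:creator>Joe Hill</dc:creator>
<dc:creator>Gabriel Rodriguez</dc:creator>
<dc:contributor>Jay Fotos</dc:contributor>
<dc:contributor>Robbie Robbins</dc:contributor>
<dcterms:isPartOf>Locke &amp; Key</dcterms:isPartOf>
</qualifieddc>"""

root = ET.fromstring(RECORD)
title = root.findtext("dc:title", namespaces=NS)
creators = [e.text for e in root.findall("dc:creator", NS)]
contributors = [e.text for e in root.findall("dc:contributor", NS)]

print(title)         # Locke & Key: Keys to the Kingdom
print(creators)      # ['Joe Hill', 'Gabriel Rodriguez']
print(contributors)  # ['Jay Fotos', 'Robbie Robbins']
```

The output shows the half-solution in practice: the creator/contributor split survives, but nothing in the record says who wrote, drew, colored, or lettered.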

Challenges in MARC format

I ran into some of the same challenges in MARC as in Dublin Core, and a few new ones.

  1. MARC allows one to include creative roles, but doesn’t seem to allow for co-equal authorship by writer and illustrator.
  2. The first indicator of the 245 field, specifying whether or not this was an “added entry,” didn’t make sense.
  3. I don’t have access to LC subject headings.

1) I used the 245 field to spell out the role each person played in the book (and called Joe Hill the “writer” vs. the “author,” based on the title page). I then put Joe Hill, as the more traditional author, in the 100 field. Based on the OCLC definition, it seemed like the 100 field should only be used for a primary author and I wasn’t sure whether or not it could be repeated. I put Gabriel Rodriguez in the 700 field and used |e to specify his role as illustrator (ill.). I decided that by not including the colorist & letterer in 700 fields, I was asserting the creative importance of Mr. Rodriguez.

2) Thanks to catalogers I asked on Twitter & via email, I learned that the first indicator is a leftover from the days of card catalogs, in which the primary entry would be under the author’s name. Thankfully, I’m from what was probably the last generation to get any practical card catalog experience, so this immediately made sense. I think it’s also an example of the legacy formatting which MARC needs to drop, or which is part of the reason it’s a good thing that libraries are developing new formats.

3) I discovered on OCLC that 655 is a genre term and decided to use it in place of 650. I made the subjects very broad (fantasy & horror) under fiction and then “graphic novels.” [Edited to add: I also learned from a cataloger at work that I should try authorities.loc.gov next time.]

Additionally, I had to generate my own 008 field in its entirety, which wouldn’t normally happen when using library software. I used online tutorials for understanding the 008, though I’m not sure it’s entirely correct. I took a small liberty in the 300 field and added “(extensively)” after ill. because, as a graphic novel, the piece is illustrated on each page.

Some advantages to MARC included the designated field for ISBNs (it took some Googling for examples to find the proper syntax for ISBN in Dublin Core), the inclusion of all creators in the 245 field as mentioned above, and the series fields (although I still need to talk to a cataloger about the difference between 490 and 830).
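
As an aside on the ISBNs: the two 020 values in my MARC record and the URN:ISBN identifier in the Dublin Core record should all describe the same book, and that can be checked with the standard ISBN check-digit rules. The sketch below is my own illustration (the function names are not from any cataloging tool) and assumes the ISBNs are given without hyphens.

```python
def isbn10_check(first9: str) -> str:
    """Check character for an ISBN-10, given its first nine digits."""
    total = sum((10 - i) * int(d) for i, d in enumerate(first9))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def isbn13_check(first12: str) -> str:
    """Check digit for an ISBN-13, given its first twelve digits."""
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def isbn10_to_isbn13(isbn10: str) -> str:
    """Convert an ISBN-10 to ISBN-13 by adding the 978 prefix."""
    core = "978" + isbn10[:9]
    return core + isbn13_check(core)


isbn13 = isbn10_to_isbn13("1600108865")
print(isbn10_check("160010886"))  # 5
print(isbn13)                     # 9781600108860
print(f"URN:ISBN:{isbn13}")       # URN:ISBN:9781600108860
```

Both check digits come out right, and the converted ISBN-13 matches the one I used in the dc:identifier element.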

Locke & Key: Keys to the Kingdom — cataloged using MARC

008	__	111009s20112011caux
020	__	|a1600108865
020	__	|a9781600108860
100	1_	|aHill, Joe
245	10	|aLocke & Key : |bKeys to the Kingdom. / |cJoe Hill, writer ; Gabriel Rodriguez, illustrator ; Jay Fotos, colorist ; Robbie Robbins, letterer |nvol. 4
246	30	|aKeys to the Kingdom
260	__	|aSan Diego, CA : |bIdea and Design Works, |c2011
300	__	|a160 p. : |bill. (extensively), col. ; |c25.5 cm.
490	1_	|aLocke & Key ; |vv.4
500	__	|a"Originally published as Locke & Key: Keys to the Kingdom issues #1-6."--t.p. verso.
505	0_	|aSparrow -- White -- February -- Casualties -- Detectives: Part 1 -- Detectives: Part 2
520	__	|a"The Locke children have grown accustomed to the myriad of magical keys discovered within the ancestral family home of Keyhouse. They have also grown accustomed to tragedy. What they may not be prepared for is just how closely danger stalks their every move as Lucas Caravaggio, alias Zach Wells, continues his relentless quest for the key to the black door." -- back cover.
655	04	|aHorror |vFiction.
655	04	|aFantasy |vFiction.
655	_0	|aGraphic novels.
700	1_	|aRodriguez, Gabriel, |eill.
830	_0	|aLocke & Key ; |vv.4

(Since my library uses Millennium and pipes, I used | instead of $ as the subfield delimiter.)
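
For anyone who wants to poke at the record above programmatically, here is a rough sketch that splits one line of this display format into tag, indicators, and subfields. It assumes the layout I used here (tag, indicators, and data separated by tabs, with | before each subfield code) and is not a parser for real MARC files. Control fields like 008 have no subfields, so they come back with an empty list.

```python
def parse_field(line: str) -> dict:
    """Split a 'tag<TAB>indicators<TAB>data' line into its parts."""
    tag, indicators, data = line.split("\t", 2)
    subfields = []
    # Everything before the first pipe is ignored; each later chunk
    # starts with a one-character subfield code.
    for chunk in data.split("|")[1:]:
        if chunk:
            subfields.append((chunk[0], chunk[1:].strip()))
    return {
        "tag": tag,
        "ind1": indicators[0],
        "ind2": indicators[1],
        "subfields": subfields,
    }


field = parse_field("700\t1_\t|aRodriguez, Gabriel, |eill.")
print(field["tag"], field["ind1"], field["ind2"])  # 700 1 _
print(field["subfields"])  # [('a', 'Rodriguez, Gabriel,'), ('e', 'ill.')]
```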

Tagged as: LBSC 670, Organization of Information

Sometimes, during a class, I want to write about something it brought up that is in no way related to the assignments and probably not relevant in class discussion. What’s a girl who owns her own domain, used to work as a WordPress back-end consultant, and writes her own WordPress themes for kicks to do? Oh…right. This.

Right now, I’m taking two classes: Intro to Archives (LBSC 605) and the Organization of Information (LBSC 670). Let’s see what happens.
