Digital Preservation

Digital preservation faces challenges on many fronts, and these are compounded for smaller independent collections such as UbuWeb. For one, it appears fundamental to the nature of the medium that no single standard may emerge, and that data/format migration or emulation will be the future of all archival endeavors, as will the need for ongoing IT mediation (and its related costs). Negotiating copyright for digital objects in an era of easy and lossless duplication also adds a layer of complication and expense. As mentioned previously, UbuWeb combines elements of both digital collections and webpages. Consequently, its digital preservation could take two main paths, each offering different opportunities and threats.

Archiving UbuWeb as Digital Collection

One option for archiving UbuWeb is to move the text and media files from their existing structure into one of the available open source software options for digital collections. This option maximizes access and browsability (which was actually one of the initial goals of this project). However, it also leaves the collection’s ultimate preservation fate ambiguous. On the one hand, it’s beneficial to adopt the standards and formats of other collections because in theory you can copy their preservation efforts (for example, LOCKSS). On the other hand, digital preservation remains a question mark for even the most prominent of collections. Will practical standards emerge, or are digital objects doomed to a kind of quasi-existence, requiring constant work to ensure their identity?

The increased access of this method (most free software is compliant with the protocols of the Open Archives Initiative) also means that copyright questions may have to be formally resolved, which requires the resources to demonstrate sufficient investigation into the rightsholders of thousands of potentially orphaned works. If those questions remain open, increased access and awareness could lead to increased legal threats, potentially limiting UbuWeb’s freedom (whether a little or a lot).

This method also requires deciding what constitutes an item (does curatorial commentary count as text or not?), as well as a great deal of time and effort creating and formatting metadata. That is before counting the time and effort spent writing grant proposals to fund all of this (pursuing rights, creating metadata, and sheer copy/paste).
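To give a sense of the per-item metadata work described above, here is a minimal sketch of a single record using Dublin Core element names, a vocabulary commonly used by digital collection software. The `make_record` helper, the bare `record` wrapper element, and all field values are hypothetical placeholders for illustration, not real UbuWeb catalogue data or the schema of any particular platform.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def make_record(fields):
    """Build a <record> element with one dc:* child per (name, value) pair.

    Hypothetical helper for illustration only.
    """
    record = ET.Element("record")
    for name, value in fields:
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return record


# Placeholder values; a real item would need each of these researched by hand.
example_fields = [
    ("title", "Example sound recording"),
    ("creator", "Unknown"),
    ("format", "audio/mpeg"),
    ("rights", "Rights holder not yet identified"),
]

record = make_record(example_fields)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Even this small record forces the decisions mentioned above: what counts as the item, who the creator is, and what can honestly be said about rights. Multiplied across thousands of files, that is where most of the effort would go.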

Probably the best open source option currently available for the creation and maintenance of digital collections is Omeka. Heavily influenced by the WordPress interface/model, it is attractive as well as easy to use and customize. Minimal database administration is required. It can also accommodate a variety of file types. Other contenders for preservation of digital collections include DSpace and the Archivists’ Toolkit. The former has proved excessively clunky and resource-intensive, and the latter requires hosting.

Archiving UbuWeb as Website

Because the structure of UbuWeb is already closer to that of a website, another option is to preserve it as a freestanding object. Site navigation and access would remain more or less the same, and less initial investment is required because no conversion efforts are necessary.

Copyright clarification may or may not be an issue for web archiving. While much of the content of UbuWeb is out-of-print or orphaned, it could be argued that, because each work is only a small part of the overall impact of UbuWeb, its inclusion constitutes non-derivative, transformative use (in cases where fair use may not apply). There is also precedent for harvesting and preserving webpages independent of the content rightsholder. For example, the government of New Zealand recently performed a harvest of all .nz sites without consulting rightsholders (and ignoring general robots.txt exclusion). Similarly, the Hoard.It project has harvested object data from a number of museum sites without seeking permission or making apologies. While not the letter of the law, these kinds of incursions on web content for archival purposes seem intuitively more acceptable than those involving individual objects/files.
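For readers unfamiliar with robots.txt exclusion, the following is a minimal sketch of how a harvester would normally consult it before fetching a page, using Python's standard `urllib.robotparser`. The robots.txt content, the `example-harvester` user agent, the `is_allowed` helper, and the example.org URLs are all assumptions made for illustration; they are not taken from UbuWeb or from any of the harvests mentioned above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; not copied from any real site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


allowed_public = is_allowed(
    ROBOTS_TXT, "example-harvester", "https://example.org/sound/index.html"
)
allowed_private = is_allowed(
    ROBOTS_TXT, "example-harvester", "https://example.org/private/notes.html"
)
print(allowed_public, allowed_private)
```

A harvest that ignores robots.txt simply skips this check and fetches every URL it finds, which is why doing so is a policy decision rather than a technical obstacle.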

Basic web archiving tools are available to almost anyone, though these kinds of applications may require a screen-by-screen harvest, which would not be practical for a site the size of UbuWeb. Fully fledged web archiving options include the Web Curator Tool (WCT) and Archive-It. However, these will most likely require more IT resources, as both use applications beyond the basic MySQL-Apache-PHP install available from conventional hosting.

Long Term

The long-term preservation prospects for UbuWeb are similar from a digital collections and a web archiving point of view. On the one hand, it’s possible that more legitimate digital versions of UbuWeb could be preserved for the long run by an institution (though it is unclear whether the benefits of this would outweigh the seeming violation of the entire ethos of the site). Preserving UbuWeb independently, on the other hand, whether by migration or emulation, requires significant long-term effort and planning, which is difficult to guarantee without institutional support. It’s also potentially trickier to count on long-term efforts and resources, as the site tends to focus on content that is potentially “difficult” for the casual reader. Its outsider nature may make it less desirable to fund, though this niche appeal could also make it more valuable; it’s tough to say.
