April 03, 2009
DigCCurr, Chapel Hill, April 1-3, 2009, continued
Day 2
Session 4: Cooperative Approaches to Digital Preservation
Martin Halbert (Emory University); Tyler Walters (Georgia Institute of Technology); Aaron Trehub (Auburn University); Richard Pearce-Moses (Arizona State Library); Jonathan Crabtree (UNC SLIS faculty)
Four archives were presented and discussed as case studies in this panel: the MetaArchive Cooperative; the Alabama Digital Preservation Network (ADPN); Data-PASS; and the Persistent Digital Archives and Library System (PeDALS, also covered in a previous session). The LOCKSS alliance is used in some of them to distribute data and content among members, which spreads the files geographically and gives other member institutions access. LOCKSS can be set up for either public or private access, as determined by the institution (whether because of copyright issues or because of private information, such as personal information in archival records).
Martin Halbert covered the MetaArchive in a brief recap of its six-year history. MetaArchive is a cooperative effort with fee-based membership. Institutional liability, incorporation strategies and cost models were studied and applied in the early development phase. Based on the needs of its members, the group also developed a set of tools that run on top of the existing LOCKSS framework, including shared 'discovery portals'.
Aaron Trehub covered the Alabama Digital Preservation Network, which combines the digital assets of academic institutions, state agencies and cultural organizations. Each member is responsible for maintaining its own network, for its startup costs, and for the hardware and software the network requires. Currently, ADPN does not charge members; it requires only that they contribute content and host copies of certain portions of other members' digital collections. ADPN has two committees, one covering policy and the other covering technical matters. The committee members represent the full membership, and the responsibility rotates among them.
Data-PASS: social science emphasis; syndicated storage platform. The presentation emphasized the scalability and asymmetric qualities of Data-PASS.
Session 5: Gaps and Persistent Challenges
Donald Sawyer (VIE, Inc.) spoke about the challenges of past data migrations and transformations, drawing mainly on his many years of experience at the National Space Science Data Center (NSSDC). (Article from the mid-90s talking about their Digital Linear Tape Jukebox) Sawyer maintains that loss is in some ways inevitable, whether due to organizational problems, hardware or software inefficiencies, or just human error. Automated checks are necessary, but storage procedures should not rely on them exclusively and should include independent, manual operations as well. Sawyer also noted that physical documents suffer loss too; digital content simply offers more possibilities for building systems that retain it much longer than the physical, though only with a solid, active digital preservation program in place.
Kevin Ashley (University of London Computer Centre) spoke about digital authenticity by way of bit stream preservation, but also through the explicit use of preservation and administrative metadata (or, 'putting the tangible on the intangible'). He raised questions about preservation planning for newer formats and the inclusion of all types of digital content (blogs, websites, virtual worlds), stressing long-term flexibility in incorporating new media and perhaps also looking to larger digital collections. He mentioned David Rosenthal's work on digital preservation (article from 2005 on digital preservation systems).
Digital Curation Research
William Underwood (Research on Speech Acts and Electronic Records): Analyzing the writing of archival description to provide a method of automatic recognition for record collections. Underwood used a collection of Presidential records as the base collection for this research and identified over 200 'speech acts'. From these he created a method to categorize the collection and provide a search mechanism that offers a different manner of retrieval.
Bernadette Callery (University of Pittsburgh) covered a recent project for graduate students at Pitt to reconstruct lost web pages using the Internet Archive. (More info about project here.) Questions arose about the authenticity of digital content, broken links and other lost web pages, and digital reconstruction efforts. Callery likened the work to art restoration, in that the reconstructions need to be distinguished from the new content.
Leslie Johnston (Library of Congress, Strategic Initiatives): Current work at the Library of Congress aims to fully define the modular services (transfer, transport and inventory) and to integrate them more meaningfully into the day-to-day production and workflow of the office. An Inventory Tool under development splits content into metadata/access files and storage (in the case of images, for example, the JP2 would be split from the TIFF here). LC is using jBPM XML workflow tools, and also the BIL Java Library. A file package format (BagIt) is in testing, with minimal identification and description requirements (the focus is on assurance that the digital transfer is complete, not on the content of the material). With the enormity of the incoming data collections, the Library of Congress had to quickly find a way to correlate digital content from multiple contributors, while also maintaining an inventory across multiple storage areas.
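To make the "assurance of a complete transfer" idea concrete, here is a minimal Python sketch of a BagIt-style checksum manifest check. It is not the Library of Congress library and not a full BagIt implementation: it assumes only a payload directory named data and a manifest file named manifest-sha256.txt with one "checksum  path" entry per line, and the function names write_manifest and verify_bag are hypothetical, chosen for illustration.

```python
import hashlib
from pathlib import Path

# Assumed manifest name; a complete bag would also carry other tag files.
MANIFEST = "manifest-sha256.txt"


def file_sha256(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def write_manifest(bag_dir: Path) -> None:
    """Sender side (hypothetical helper): list every payload file with its checksum."""
    lines = []
    for p in sorted((bag_dir / "data").rglob("*")):
        if p.is_file():
            rel = p.relative_to(bag_dir).as_posix()
            lines.append(f"{file_sha256(p)}  {rel}")
    (bag_dir / MANIFEST).write_text("\n".join(lines) + "\n", encoding="utf-8")


def verify_bag(bag_dir: Path) -> list[str]:
    """Receiver side (hypothetical helper): report missing, altered or unlisted files."""
    problems = []
    listed = set()
    for line in (bag_dir / MANIFEST).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        expected, rel = line.split(maxsplit=1)
        listed.add(rel)
        target = bag_dir / rel
        if not target.is_file():
            problems.append(f"missing: {rel}")
        elif file_sha256(target) != expected:
            problems.append(f"checksum mismatch: {rel}")
    # Files present in the payload but absent from the manifest.
    for p in sorted((bag_dir / "data").rglob("*")):
        if p.is_file():
            rel = p.relative_to(bag_dir).as_posix()
            if rel not in listed:
                problems.append(f"unlisted: {rel}")
    return problems
```

An empty list from verify_bag means every listed file arrived intact and nothing extra was added. The check says nothing about what the files contain, which matches the emphasis described in the talk.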
Digital Curation Tools and Strategies
David Giaretta (Preservation Workflows, Strategies and Infrastructure)
PARSE.insight is one of the projects Giaretta mentioned in his presentation. The group has published a 'roadmap' that covers a number of both technical and non-technical aspects of the work.
Robin Rice (EDINA and Data Library, University of Edinburgh): Data Information Specialists Committee - UK (DISC-UK) was formed out of the need for a common ground where UK institutions can share experiences and discuss work models, workflows, and tools/technologies for digital projects. One of the more interesting areas of application and research is the Web 2.0 environment (recent article from Stuart MacDonald).
Mike Smorul (University of Maryland): Audit Control Environment (ACE), a way to validate the integrity of digital information by way of mathematical techniques. ACE provides a series of ways for someone to check digital contents, in both centralized and distributed environments. It also offers an efficient method to correctly and consistently maintain the integrity of data over time. ACE creates a solidified tree root directory, greatly decreasing the time required in future validation processes to ascertain the integrity of files. It can also detect duplicates and locate renamed files.
Information about the upcoming DigCCurr Professional Institute: Curation Practices for the Digital Object Lifecycle, June 21-26, 2009 and Jan. 6-7, 2010
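As a rough illustration of how a single tree root can stand in for a whole collection, here is a small Python sketch of a hash tree over file digests. It assumes that the 'tree root' in these notes refers to something like a Merkle-style hash tree. It is not ACE's code or API, and every function name below is hypothetical.

```python
import hashlib
from collections import defaultdict


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def merkle_root(leaf_digests):
    """Fold a list of hex digests pairwise until one root digest remains."""
    if not leaf_digests:
        return sha256_hex(b"")
    level = list(leaf_digests)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # pad odd levels by repeating the last digest
        level = [
            sha256_hex((level[i] + level[i + 1]).encode("ascii"))
            for i in range(0, len(level), 2)
        ]
    return level[0]


def root_for(files):
    """files maps name -> bytes; names are sorted so the root is deterministic."""
    return merkle_root([sha256_hex(files[name]) for name in sorted(files)])


def find_duplicates(files):
    """Group file names that share the same content digest."""
    by_digest = defaultdict(list)
    for name, content in files.items():
        by_digest[sha256_hex(content)].append(name)
    return {d: sorted(names) for d, names in by_digest.items() if len(names) > 1}


def find_renamed(old, new):
    """Pair names that vanished from `old` with new names carrying the same content."""
    vanished = {sha256_hex(c): n for n, c in old.items() if n not in new}
    pairs = []
    for name in sorted(new):
        digest = sha256_hex(new[name])
        if name not in old and digest in vanished:
            pairs.append((vanished[digest], name))
    return pairs
```

Comparing one stored root against a freshly computed one is enough to tell whether anything in the collection changed. Because the digests depend only on content, the same digests also expose duplicate files and files that were merely renamed.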
Posted by vad17 at April 3, 2009 12:04 PM