
March 13, 2008

Digital Libraries continued

“The information revolution not only supplies the technological horsepower that drives digital libraries, but fuels an unprecedented demand for storing, organizing, and accessing information—a demand which is, for better or worse, economically driven rather than curiosity driven…” pp10

Economy is indeed important. Many people I encounter who request that material be put into Digital Case assume that the most important (and most costly) step is the scanning of the items. This, surely, is the bottleneck. True, depending on what needs doing, it can be a crucial and costly step; but the reality is that scanning is only one small part of the entire process. If the collection to be scanned is vast, there must be a selection process (time = money). There is the scanning process itself, as mentioned already: if the materials are 8.5x11 paper of good quality, a sheetfed scanner can be used, but if the materials are handwritten, fragile, or not paper at all (slides, photographs, etc.), they must be scanned individually (time = money). There may be optical character recognition (OCR) (time = money). The files must then be saved under a consistent file naming convention so that they can be recognized; metadata must be supplied for each item (time = money); and the files must be uploaded into a system, either batch processed or one at a time. Then there are the storage costs, the network costs, the interface design costs, the migration costs, and so on.
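To make the arithmetic concrete, here is a rough sketch in Python that adds up the labor for the steps listed above. Every number in it is invented for illustration: the hourly rate, the minutes per item, and the step names are assumptions, not figures from Digital Case. Storage, network, interface, and migration costs are left out entirely.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    minutes_per_item: float  # hypothetical staff time per item


HOURLY_RATE = 20.0  # hypothetical staff cost in dollars per hour

# Hypothetical per-item times for each step of the process.
STEPS = [
    Step("selection", 1.0),
    Step("scanning", 2.0),
    Step("ocr", 0.5),
    Step("file naming", 0.5),
    Step("metadata", 5.0),
    Step("upload", 0.5),
]


def labor_cost(step, items, hourly_rate=HOURLY_RATE):
    """Labor cost of one step for a given number of items."""
    return items * step.minutes_per_item / 60.0 * hourly_rate


def cost_breakdown(items, steps=STEPS):
    """Map each step name to its labor cost."""
    return {step.name: labor_cost(step, items) for step in steps}


def scanning_share(items, steps=STEPS):
    """Fraction of the total labor cost that goes to scanning."""
    costs = cost_breakdown(items, steps)
    return costs["scanning"] / sum(costs.values())


if __name__ == "__main__":
    for name, cost in cost_breakdown(1000).items():
        print(f"{name:12s} ${cost:,.2f}")
    print(f"scanning share of labor: {scanning_share(1000):.0%}")
```

With these made-up figures scanning comes to roughly a fifth of the labor; different assumptions would give a different split, but the shape of the calculation stays the same.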

“If information is the currency of the knowledge economy, digital libraries will be the banks where it is invested.” (I understand the point, but take issue with the exclusivity of the statement—as databases are by far the bigger banks.)

H.G. Wells was promoting the notion of a “world brain.” I have often thought about this, too, though only in the context of a network of computers, very like the various neurons of our brain connected through dendrites and synapses: the great web of computers unifying into one gigantic Borg.

“Vannevar Bush, the highest-ranking science advisor in the U.S. war effort, urged us to ‘consider a future device for individual use, which is a sort of mechanized private file and library…a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility.’” Pp16 (the Memex, Bush’s automated library)

Argument for Digital Humanities by Licklider: http://www.ibiblio.org/pioneers/licklider.html

"About 85 per cent of my "thinking" time was spent getting into a position to think, to make a decision, to learn something I needed to know. Much more time went into finding or obtaining information than into digesting it. Hours went into the plotting of graphs, and other hours into instructing an assistant how to plot. When the graphs were finished, the relations were obvious at once, but the plotting had to be done in order to make them so. At one point, it was necessary to compare six experimental determinations of a function relating speech-intelligibility to speech-to-noise ratio. No two experimenters had used the same definition or measure of speech-to-noise ratio. Several hours of calculating were required to get the data into comparable form. When they were in comparable form, it took only a few seconds to determine what I needed to know.

"Throughout the period I examined, in short, my "thinking" time was devoted mainly to activities that were essentially clerical or mechanical: searching, calculating, plotting, transforming, determining the logical or dynamic consequences of a set of assumptions or hypotheses, preparing the way for a decision or an insight. Moreover, my choices of what to attempt and what not to attempt were determined to an embarrassingly great extent by considerations of clerical feasibility, not intellectual capability." (Licklider).

“Wells, Bush, Licklider, and other visionary thinkers were advocating something very close to what we might now call a virtual library. To paraphrase the dictionary definition, something is virtual if it exists in essence or effect though not in actual fact, form, or name. A virtual library is a library for all practical purposes, but a library without walls—or books.” Pp16

Witten and Bainbridge then make the observation that libraries have always used “virtual” collections, even right from the beginning, in that they used catalogs and indexes. After all, catalogs and indexes are representations of the thing, not the thing itself.

“A library catalog is a complete model that represents, in a predictable manner, the universe of books in the library. Catalogs provide a summary of, if not a surrogate for, library contents. Today we call this “metadata.” And it is highly valuable in its own right.

“The information in library catalogs and bibliographies can be divided into two kinds: the first having reference to the contents of books; the second treating their external character and the history of particular copies. Intellectually only the abstract content of a book—the information contained therein—seems important. But the strong visceral element of books cannot be neglected and is often cited as a reason why book collections will never become ‘virtual’.” Pp 17

Examples: the forest of steles, The Book of Kells (a masterpiece of Western art); “you might believe it was the work of an angel rather than a human being.” –Giraldus Cambrensis, scholar 13th century. (and in art we are aspiring to our god-like natures) pp18

“A picture of the cover may be displayed as a ‘tangible’—or at least memorable—emblem of the physical book itself. Users can browse the collection using graphical techniques of virtual reality. Maybe they will even be able to caress the virtual cover, smell the virtual pages…” pp20

Posted by twh7 at 07:29 PM

March 10, 2008

Steles

Digital Libraries “are about new ways of dealing with knowledge: preserving, collecting, organizing, propagating, and accessing it—not about deconstructing existing institutions and putting them inside an electronic box.” Witten and Bainbridge, p6

Digital Case stores, disseminates, and preserves the intellectual output of Case faculty, departments and research centers in digital formats (both “born digital” items as well as materials of historical interest that have been digitized). Kelvin Smith Library manages Digital Case on behalf of the university. With Digital Case, KSL assumes an active role in the scholarly communication process, providing expertise in the form of a set of services (metadata creation, secure environment, preservation over time) for access and distribution of the university’s collective intellectual product.

So, I have been suddenly overcome by the fact that a library created in the Song Dynasty in China has existed since 1100 A.D. Well, not really by this fact, as I’m actually disappointed that no library has continually existed longer than this (it seems like there should be one somewhere). Regardless, what has overcome me is that many of the items in the collection are stone slabs called “steles.” They are located in Xi’an, “an ancient walled city in central China.” But what has most impressed me is the photograph of a man holding up a “rubbing” from a stele.

When I was a kid I would sometimes do a “rubbing” of a gravestone using paper and chalk. It never occurred to me that books could be handled this way. I am just struck thinking about how libraries today lend the actual item (a print, to be sure, but the actual item); and that we are now creating copies that can be downloaded and so on. And yet, here is an example that is over 900 years old where a patron could come in, get some paper—or whatever they used—and chalk—or whatever they used—and rub off a copy, roll it up, and then walk out. Each person could come in and take away any of the 2000 items in a sort of first shot at downloading a version.

And, as Witten and Bainbridge drolly note, “We think of the library as the epitome of a stable, solid, unchanging institution, and indeed the silent looming presence of 2,000 enormous stone slabs—often called the ‘forest of steles’—certainly projects a sense of permanence.”

Posted by twh7 at 07:26 PM

March 06, 2008

Digital Libraries

Digital Libraries “are about new ways of dealing with knowledge: preserving, collecting, organizing, propagating, and accessing it—not about deconstructing existing institutions and putting them inside an electronic box.” Witten and Bainbridge, p6

Digital Case stores, disseminates, and preserves the intellectual output of Case faculty, departments and research centers in digital formats (both “born digital” items as well as materials of historical interest that have been digitized). Kelvin Smith Library manages Digital Case on behalf of the university. With Digital Case, KSL assumes an active role in the scholarly communication process, providing expertise in the form of a set of services (metadata creation, secure environment, preservation over time) for access and distribution of the university’s collective intellectual product.

Examples of digital libraries (from Witten and Bainbridge):

1. Supporting Human Development: In Kataayi, in rural Uganda, community members have “built ferrocement rainwater catchment tanks, utilized renewable energy technologies (solar, wind, biogas), and established a local industry making clay roofing tiles.” All this was made possible by the Humanity Development Library, which supplied 1,200 books on CD-ROM for them to use. As Witten and Bainbridge point out, these 1,200 books in print “would weigh 340 kg, cost $20,000, and occupy a small library bookstack.” The Kataayi community also uses related collections of materials on CD-ROM on such topics as disaster relief, agriculture, the environment, medicine and health, food and nutrition, etc.
2. Pushing the Frontiers of Science: “for the last decade physicists have been using automated archives to disseminate the results of their research.” In 1990, 200 physicists started the project with a few research questions and papers. By 2000, tens of thousands of physicists were using it and over 150,000 papers were in the archives with 150,000 requests handled daily. As Witten and Bainbridge note, “for some areas of physics, online archives have already become the dominant means of communicating research progress.” Pp3
3. Preserving a Traditional Culture. Worried that their culture and traditions were vanishing, the Zia Pueblo are creating a digital library that “will include an oral history compilation, with interviews of tribal elders conducted in their native language…an anthology of traditional songs, with audio recordings, musical scores transcribed from them, and lyrics translated by a native speaker…video recordings of tribal members performing Pueblo dances and ceremonies, along with a synopsis describing each ceremony and a transcription and translation of the recorded audio.” Pp4
4. Exploring Popular Music. Very like iTunes. That is what I thought, at least, initially when it was described as “a digital music library that reflects popular taste, a library that people from all walks of life will want to use.” But there are some singular features: the ability to submit a tune either by recording yourself humming or by MIDI keyboard, and have the library search based on the sound file (for all those people who, as in the Billy Joel song, know only that the tune is “sad and it’s sweet and I knew it complete, when I wore a younger man’s clothes”); it can also be searched by lyrics, titles, authors/composers, etc.

As Witten and Bainbridge point out, these “four examples…hint at the immense range of digital libraries,” and digital libraries are not always as “scholarly and esoteric” as they often seem (p5). I would here concur completely and point to Digital Case as the prime example, as it not only holds a diverse array of materials, but in the future will diversify even further. Right now there are image files, such as the WPA Prints collection; audio files, such as the Center for Policy Studies lecture series; video files, such as the Freedman Center video collection; datasets, such as the work by Tim Beal in his consideration of pluralism in undergraduate students at Case; full-text digital books; and much more.

Witten and Bainbridge then go about providing technical definitions of what libraries are, what digital libraries are, how they are different or mean different things to everyone, and so on. But I think the most important point they make is the point that leads off this article, namely that:

Digital Libraries “are about new ways of dealing with knowledge: preserving, collecting, organizing, propagating, and accessing it—not about deconstructing existing institutions and putting them inside an electronic box.”

This is one of the main goals of the Freedman Center’s Freedman Fellows program, which will soon be kicking off its fourth year of making significant connections between faculty and new technologies: specifically to introduce them to the new tools available and encourage new ways of seeing how scholarship and technology can interact.

With all this in mind, Witten and Bainbridge state that a “digital library is conceived as an organized collection of information… a focused collection of digital objects, including text, video, and audio, along with methods for access and retrieval, and for selection, organization, and maintenance of the collection.” p6

I guess the only thing I would add to this definition, at least explicitly (as it may be taken to have been implied), is “a focused collection of DESCRIBED digital objects…” to put the emphasis on the metadata element that is essential to a good digital collection. As I said, this may have been implied by “methods for access and retrieval” or even “organization,” but, knowing what I know now, the metadata is almost as important as the object itself.
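A minimal sketch, in Python, of what "described" might mean in practice: a file paired with a metadata record, plus a check that the record is complete enough for the object to be found. The field names are borrowed from Dublin Core element names (title, creator, date, format), but the choice of required fields, the class, and the sample values are all hypothetical and do not represent how Digital Case actually stores its records.

```python
from dataclasses import dataclass, field

# Hypothetical minimum description; the names follow Dublin Core elements.
REQUIRED_FIELDS = ("title", "creator", "date", "format")


@dataclass
class DigitalObject:
    filename: str
    metadata: dict = field(default_factory=dict)

    def missing_fields(self):
        """Required fields that are absent or empty."""
        return [name for name in REQUIRED_FIELDS if not self.metadata.get(name)]

    def is_described(self):
        """True when every required field has a value."""
        return not self.missing_fields()


def search(collection, term):
    """Return described objects whose metadata mentions the term.

    The search looks only at metadata values, so an undescribed
    file is invisible no matter what it contains.
    """
    term = term.lower()
    return [
        obj
        for obj in collection
        if obj.is_described()
        and any(term in str(value).lower() for value in obj.metadata.values())
    ]


if __name__ == "__main__":
    bare = DigitalObject("scan_0001.tif")  # hypothetical file, no metadata
    described = DigitalObject(
        "scan_0002.tif",
        {
            "title": "Sample print",       # hypothetical values throughout
            "creator": "Unknown artist",
            "date": "1938",
            "format": "image/tiff",
        },
    )
    print(bare.missing_fields())
    print([obj.filename for obj in search([bare, described], "print")])
```

The undescribed scan is never returned by the search, which is the point: without the description, the object is in the collection but cannot be retrieved from it.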

Witten and Bainbridge go on to mention the inclusion of such things as 3D objects, simulations, dynamic visualizations, and virtual reality worlds. Some of these, I think, go a bit far. At Case we consider some of these elements to be “learning objects” and leave them in the sphere of instructional technologies, which often are organized in a manner similar to digital libraries. Virtual worlds are entities in and of themselves, and including them suggests to my mind the notion of saying, “our digital library includes other digital libraries,” but perhaps some do. Close to this is the notion that a virtual world is not necessarily a “digital object,” unless you want to consider it a massive digital object with extraordinarily complex properties. 3D objects are something that Digital Case is considering as well.

“Every collection should have a well-articulated purpose, which states the objectives it is intended to achieve, and a set of principles, which are the directives that will guide decisions on what should be included and—equally important—what should be excluded.” Pp7

As if to confirm the above paragraph, Witten and Bainbridge draw a firm distinction between a digital library and the World Wide Web in general:

“the Web lacks the essential features of selection and organization…what connects a new acquisition into the structure of a physical library is partly where it is placed upon the shelves, but more important is the information about it that is included in the library catalog”; that is, the metadata.

This is, of course, not to say that there are no websites out in dubdub land that are carefully organized and that provide excellent selection criteria, but on the whole, the Web is a chaos where very little critical thought is applied to the content. I do not agree with Witten and Bainbridge in their other characterization, that the ease with which material can be added is a significant determinant of whether something is or is not a digital library; nor do I agree that the requirement of “manual updating the structures used for access and retrieval” adversely impacts the categorization. After all, applying this definition would exclude all the years in which card catalogs had to be painfully (manually) updated, created, and installed as a part of the regular system of library functions.

I will continue this discussion later.

Posted by twh7 at 07:24 PM