Cataloguing beyond the walls: APLA 1997
Not too surprisingly, there is much of interest to researchers and others among our users who are desirous of finding timely information or useful contacts in almost any subject, no matter how esoteric. Just as it is with the print resources that libraries collect, the problem with electronic resources has been as much one of quantity as of quality. Our role as librarians is to select from this universe just those available electronic resources which match the needs of our user population and at a cost commensurate with the value of the information to the institution as a whole. Once this information is identified and purchased (if indeed a cost is even involved), it should be within the mandate of the library to provide discovery aids at a level appropriate to the value placed on the information to begin with.
In a recent issue of Library Technology Reports, David Barber defines four types of digital library data: images, text, geographic data, and numeric data (Barber 1996). These categories are not necessarily equal in terms of quantity of resources. Text, for example, makes up 90% of the content of the Web.
For the purposes of this paper I have divided up the Internet universe into seven convenient categories of information which have potential interest for libraries. These are:
While not an exhaustive list of possibilities, these categories should be sufficient to our purpose here, which is to demonstrate that the Internet does indeed provide access to resources as valuable as those "traditional" resources which we are selecting for adding to our collections now.
We are all aware of network resources like email, listservs, newsgroups, and chat sites, tools some of us use every day in our work and, dare I say it, our play. Many of these are not designed for public consumption and are of limited use beyond the clientele they already serve, however well. Some, like newsgroups and listservs, which may have information of potential value to those outside of their pool of subscribers, are ephemeral by nature and must rely on other technologies, like Web or gopher sites and electronic archives, to present a more public front-end. These sites consist of primary documentation and are mainly of interest to researchers in a given discipline who are interested in establishing what a particular user population is or was thinking about at a given point in time. As an example in our own discipline, the Web4Lib list maintains an extremely well-organized Website with subscription and policy information and an archive which allows for searches by subject, poster, or date. Once relevant information has been found, hypertext links are provided to enable the researcher to move to the next article in a subject thread, on a particular date, or by a particular author.
Since arguably most librarians, and probably most library patrons as well, think of the library as a collection primarily of books, it shouldn't be too surprising to find that some of the earliest digitization projects involved monograph literature. Project Gutenberg is probably one of the best known of these since it has been ongoing as a largely volunteer-based digitization project since 1971. Michael Hart's vision was to see as much of the world's public domain literature in electronic form as possible and in a format that any level of computer from a Commodore PET or VT 52 to an IBM RS6000 or even Cray supercomputer could access. Hart chose plain ASCII text as the vehicle for his texts and opened the doors to volunteers from around the world to contribute to the creation of an electronic library. While not particularly esthetically pleasing or easy to read, and certainly not as technologically sophisticated as the later SGML versions of etexts, Gutenberg texts have provided a good basis for much of the later work in electronic text communication. An example is Edith Wharton's The age of innocence.
More technically ambitious digitization projects have made use of advances in presentation technology, like digital imaging, SGML, hypertext, and automatic indexing, made since the heyday of Project Gutenberg. Starting with projects like Hunter Monroe's largely Gopher-based Alex catalogue and moving more fully into Standard Generalized Markup Language (SGML) text in the University of Virginia's Electronic Text Center, Columbia University's Project Bartleby, and Oxford University's Oxford Text Archive, digital text has come to represent something of more scholarly interest to libraries and their patrons. An example of "second generation" electronic text may be seen with Mary Wollstonecraft's A vindication of the rights of woman at the Project Bartleby site. Here one finds more readable text, combined with limited images, and indexing at least at the chapter level.
While digitization projects have enabled worldwide access to texts of varying readability through the medium of the Internet, they have not necessarily used this medium to fullest advantage by adding value to the bare text. Obviously, adding value to text is a scholarly activity in itself, demanding a great deal of time and effort, something more likely to be taken on by individuals rather than the corporate entities underwriting the first two mentioned projects. A good example of some of the possibilities presented by the medium is H. Churchyard's hypertext version of Pride and prejudice by Jane Austen, mounted on the University of Texas server. Here the text itself may be retrieved in a plain text version for qualitative or quantitative analysis, but more useful are Churchyard's annotations, genealogical tables, character sketches, and links to illustrations and other Austen sites.
Finally, no discussion of electronic texts would be complete without mention of the considerable effort being made in this area by governments at all levels. This has been a mixed blessing for libraries collecting in this area. On the positive side, the considerable cost of acquiring, claiming, shipping, storing, and accessing government information is saved on both sides of the information provider/user divide. Information that might have taken several months to order, ship, and catalogue can now be made available as soon as it is ready for public distribution via the Internet. Since the online version is not subject to the single-use restriction of paper, an unlimited number of "copies" may be made available at the point of greatest use, e.g. closest to publication. On the other hand, governments are often quick to make changes to their Web server configurations, leaving users searching in vain for missing resources. Governments are also often guilty of removing information which they see as no longer serving their original, rather short-range, goals. Frequently this information is provided in formats which demand a fair bit of network bandwidth and end-user system resources, such as Adobe Acrobat's .pdf or a specific word processor's internal format. Yet it is likely that this use of the Internet will grow rapidly over the next few years.
Mention should also be made of recent projects to digitize other kinds of university electronic texts of interest to their libraries, namely theses and reserve materials. Solinet's Monticello Electronic Library is one such project dealing with dissertations, along with other formats, which will then be shared within a consortium of Southeastern U.S. libraries. Electronic reserves, an idea with considerable attraction for libraries where demand for reserve materials causes an inordinate strain on circulation staff as well as students, have been implemented at the University of Wisconsin's Steenbock Library as well as others.
Early attempts at creating electronic journals used the technology of the time to distribute "issues" over the net. Many of these used email and electronic lists as the medium of choice for communication with subscribers due to the absence of any standard client application capable of rendering files thus created, particularly the graphics, mathematical formulae, and other features that gave journals much of their usefulness and appeal. OCLC broke the ice to a degree with its Guidon interface, used to provide access to its ground-breaking Online journal of current clinical trials in 1992, but it was the widespread acceptance of the World Wide Web, which occurred when NCSA began distributing the Mosaic graphic browser in 1993-94, that really got the attention of journal publishers. Even major publishers like Elsevier, whose Tulip Project was operating parallel to OCLC's Electronic Journals Online in 1992, eventually joined OCLC on the Web bandwagon by 1994.
Today, HTML has become the preferred vehicle for the publication of thousands of electronic journals, with more coming online each day. Many, like the Journal of the American Mathematical Society and Science online, are simply online versions of the paper edition, with hypertext links between tables of contents and individual articles. Others diverge significantly from their paper version, adding hypertext indexes, three-dimensional animated graphics, sound files, and interactive programs to the more traditional text and 2D images. Newspapers are also getting in on the act, with everything from the Kathmandu Post to the local rag online.
The viability of electronic journals is "guaranteed" in several ways: by paid subscription, distribution of passwords to paper edition subscribers, pay-by-the-article, commercial advertisements, institutional funding, and even volunteer efforts funded by individual Webmeisters. Since use, and therefore cashflow, is usually stronger immediately following publication and tends to fall off sharply over time, commercial publishers have had difficulty justifying the rather huge cost of archiving back issues in relation to their likely use. For this reason, various institutions and consortiums have created archives of these back issues, either as part of the mandate of the library (see the National Library of Canada's Electronic Collection for example), or as a cost-sharing venture (see JSTOR or the CIC Electronic Journals Collection for examples of this).
While institutions such as the Library of Congress have been digitizing images for internal use for some time, it is only in the past few years that these have been made available on the Internet. Their absence from the net has largely been due to the size of the files themselves and the inability of early browsers to handle them. Following on the lead of LC, along with GIS (Geographic Information System) developers and art museums (check out the Louvre online), libraries themselves are increasingly getting into the digital image business. Columbia University's Digital Image Access Project is one example of this. Here is an image of the Empire State Building from that collection.
Examples of other image files of interest to library users and scholars include almost any of the images available from the main page at the Library of Congress. For example, they are currently exhibiting a selection of turn-of-the-century baseball cards, as well as images from their motion picture collection. Esquimaux village from the Buffalo World's Fair, 1901 is an example of this and may be downloaded as moving images as well as stills.
No discussion of image files would be complete without mentioning the substantial number of GIS sites available worldwide. These cover the entire range of print map collections from digitized aerial photos and satellite images to street maps of your own neighbourhood. Frequently the collections include indexes by place name, coordinates, or even by guide maps. A good example of a site combining many of these may be found at the U.S. Geological Survey's Global Land Information System site.
Electronic databases and other finding aids
This category is a catch-all for a number of primarily reference tools which commonly appear on library Web pages and is not terribly distinguishable from the next two categories at times. Examples include indexes and tables of contents to specific periodicals such as are frequently found on electronic journal sites, dictionaries such as Merriam-Webster's WWWebster or Roget's Internet Thesaurus, databases like the Lawyer Locator, and gazetteers such as Natural Resources Canada's Canadian Geographical Names. One such source of particular interest in Newfoundland is the Web version of the List of lights, buoys and fog signals Newfoundland.
One of the more celebrated sites on the Internet has to be the National Library of Medicine's Visible Human Project, a project to digitize cross-sectional MRI images of the human body every 1 centimeter from head to toe. Probably more amazing is that the Web abounds with other sites that exploit the medium just as well as the NLM. For example, CNN Interactive goes beyond TV and the electronic newspaper to present a smorgasbord of news, sports, weather (including digital satellite imagery), travel, and health information updated in real time. Canadians may also want to check out Canoe, which like CNN Interactive combines news, weather, and sports with ejournals like Maclean's, Chatelaine, and the Financial Post. Epicurious combines a search engine for recipes with ejournals (Gourmet, Bon Appétit) and travel information all in one site.
Academic and school libraries will probably get a lot of mileage out of sites like Peterson's Education & Career Center, a site combining information on both private and public grade schools with data on undergraduate and graduate-level programs, job placement, and education financing.
Websites acting as gateways to other resources
Probably reflective of the origin of much of the Web in the transformation of Gopherspace, many WWW sites remain as collections of links to further sites. While most of these are probably not of much utility to libraries, many have assumed the role of Internet bibliographies or directories, directing the user to electronic resources which are of interest in a particular discipline, format or subject area. One of the most common examples is the official government home page, which consists of links to government agencies, publications, personnel, and programs. Examples of these can be found at Canada Site, the Government of Newfoundland and Labrador page, and those for Nova Scotia, New Brunswick, and PEI. While governments themselves usually maintain their own sites, the International Court of Justice website is actually maintained at Cornell University.
Also included in this category are the Internet search engines such as Yahoo, Alta Vista, and Canada's own Maple Square.
URL: http://www.mun.ca/library/cat/catnet/WhatsOutThere.htm
Last revised: 21-May-1997 22:36 NST
Document author: Charley Pennell