Subject: Center for Electronic Texts in the Humanities
From: Stephen Ferguson <0629212@PUCC.BITNET>
Date: Fri, 3 Jan 1992 09:18:20 EST
Message-id: <"UFovq1.0.jn2.EMBCn"@sul2>
Sender: Rare Books and Special Collections Forum <EXLIBRIS@RUTVM1.BITNET>
The following news may interest a number in our group. It comes from
the Director of the Center, Susan Hockey.
------------------------------------------------------------
Center for Electronic Texts in the Humanities
Susan Hockey, Director
Rutgers and Princeton Universities have recently established a Center for
Electronic Texts in the Humanities with external support from the Andrew W.
Mellon Foundation and the National Endowment for the Humanities. The Center
is intended to become a national focus of interest for those who are
involved in the creation, dissemination and use of electronic texts in the
humanities, and it will act as a national node on an international network
of centers and projects which are actively involved in the handling of
electronic texts.
The Center is guided by an Advisory Board consisting of outstanding
humanities scholars, information professionals, publishers, and computer
scientists who meet once per year to develop priorities for the Center's
activities. Using the Inter-University Consortium for Political and Social
Research as a model, the Center plans to expand its activities to include
membership by other institutions. The operations of the Center are divided
between Rutgers and Princeton Universities, with the administrative
headquarters at Rutgers in the Alexander Library, and the computing
operations mainly at Princeton.
The Center has developed from the international inventory of
machine-readable texts which was begun at Rutgers in 1983. The inventory is
held in RLIN and the Center is giving priority to the development of the
inventory during its first year of operation. In many cases, information
about machine-readable texts, such as the exact source material, encoding
scheme, and revision history, is very sketchy. The Center will review the
records which are already in the inventory and attempt to enhance them
where possible. Texts which are available commercially will have priority
for cataloguing. The Center is also collaborating in the survey of
machine-readable texts organized by Professor Antonio Zampolli and Dr
Donald Walker on behalf of all the major text analysis computing
organizations, and will work on the survey sections on corpora, text
collections and individual texts.
A second activity of the Center will be the acquisition and dissemination
of text files to the community. Our present plans are to concentrate on a
selection of good-quality texts which can be made available over the Internet
via suitable retrieval software and with appropriate copyright permissions.
All texts which the Center holds will be encoded according to the
Guidelines of the Text Encoding Initiative (TEI), an international project
sponsored by the Association for Computers and the Humanities (ACH), the
Association for Computational Linguistics (ACL) and the Association for
Literary and Linguistic Computing (ALLC). The TEI has developed a tag set
using SGML (the Standard Generalized Markup Language) which is applicable
for many different types of humanities texts. It also includes guidelines
on the documentation of machine-readable text files using a TEI header
which can be used for cataloguing information.
A third activity of the Center will be the provision of educational
programs for humanities computing and methodologies for research and
instruction using electronic texts. The first, a seminar, will take place on
August 9-21, 1992 at Princeton, where the tutors will be Dr Willard McCarty of
the Centre for Computing in the Humanities, University of Toronto, and Susan
Hockey. This seminar is intended for researchers and librarians who have
some basic computing experience, e.g. word processing and electronic mail,
but little or no experience of computers in a research environment. It will
cover topics such as text encoding, methods of text acquisition,
concordances, text retrieval, preparing critical editions and hypertext,
with practical work using software such as TACT and Micro-OCP. The
seminar will look at the current generation of software tools and then go
on to examine what is needed to make these tools better for research
applications in the humanities. Techniques in morphological analysis,
syntactic analysis and parsing methodologies are being developed in
computational linguistics and natural language understanding research and
these may be applied to humanities texts to facilitate retrieval,
particularly when they are used in conjunction with a lexical database or
machine-dictionary.
The Center will also act as a clearinghouse on information related to
electronic texts and will direct enquirers to other sources of information,
for example the Catalogue of Projects in Electronic Texts compiled by the
Georgetown University Center for Text and Technology and the Text Archive
held at Oxford University Computing Service. A regular newsletter will be
produced and the information in it made available electronically via a
bulletin board.
Another important role for the Center will be participation in
conferences and workshops which are concerned with electronic resources and
tools for manipulating them. The Center will be represented at the workshop
of the Consortium for Lexical Research to be held in New Mexico in January,
where a model for consortial development of lexical resources will be
discussed. It will also participate in the next joint annual conference of
the ALLC and ACH at Christ Church, Oxford, England on April 5-9, 1992, where
new methodologies for the use of electronic texts will be presented. The
Project Director and Center Director are actively involved in the Text
Encoding Initiative and the Center expects to play a leading role in the
testing, evaluation and dissemination of the TEI Guidelines.
Electronic texts are still in their infancy compared with the amount of time
that printed materials have been available. We do not really know how to
handle them in the general sense, how to preserve them, or how to maintain
them in a usable form whilst keeping up with new developments in the
technology. In the longer term the Center plans to collaborate with
centers and projects in Europe, Japan and elsewhere to conduct a
feasibility study to establish ground rules for handling electronic texts,
and then to establish mechanisms which can be used by all who have an
interest in such material.
For further information about the Center, please contact:
Center for Electronic Texts in the Humanities
169 College Ave,
New Brunswick
NJ 08903
phone: (908) 932-1384
fax: (908) 932-1386
electronic mail: ceth@zodiac.rutgers.edu