GenWeb: A WWW Genealogy Proposal v. 2.0
19 June 1994
Contents
Basic Elements of the WWW system
Applying the Basic Elements to Genealogy
- The Genealogy Document
- Implications of GenWeb
- Research in Sources and Indexes
- Converting Present-day Systems
- Linking to other Services
- Security and Privacy
- GenWeb Demonstration
With the advent of the World-Wide Web concept on the Internet, we finally
have the tools to easily create a coordinated, interlinked, distributed, global
genealogy database called GenWeb. This document describes how GenWeb might be put
together very simply using hypertext documents residing on servers connected by the Internet.
This proposal is subject to change and, as it evolves, this document could
become the kernel of a formal Internet RFC covering this subject.
A WWW document is a plain text file that includes special codes to describe
how it should be displayed and how it may be linked to other documents. These
codes follow a standard called HyperText Markup Language or HTML. A WWW
document resides on a WWW server which publishes or transmits its files
over the Internet to whomever requests them. The codes in the WWW documents
allow a WWW client or browser application to display the document with
appropriate visual formatting. A WWW document may also contain codes
that direct the reader to other documents on the same server or on different
servers. These other documents may be text files or they may be picture, sound,
or video files.
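As a rough illustration of the paragraph above, the following Python sketch holds a tiny HTML document as plain text and pulls out the link codes it contains. The sample document, its file names, and the host example.org are invented for illustration only; they are not part of the GenWeb demonstration.

```python
from html.parser import HTMLParser

# Hypothetical sample document. The person, file names, and remote host
# are made up; only the use of HTML anchor codes comes from the text.
SAMPLE_DOCUMENT = """<html>
<head><title>John Example</title></head>
<body>
<h1>John Example</h1>
<p>Father: <a href="james-example.html">James Example</a></p>
<p>Portrait: <a href="http://example.org/photos/john.gif">photograph</a></p>
</body>
</html>"""


class LinkCollector(HTMLParser):
    """Collect the target of every <a href="..."> code in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(document):
    """Return the link targets found in an HTML document, in order."""
    collector = LinkCollector()
    collector.feed(document)
    collector.close()
    return collector.links


if __name__ == "__main__":
    for target in extract_links(SAMPLE_DOCUMENT):
        print(target)
```

The first link points to a document on the same server (a relative name), the second to a picture file on a different server, which are the two cases the paragraph describes.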
A WWW client, sometimes called a "browser", is a computer program which
requests a document from a WWW server and presents what it receives to its operator. Often
auxiliary or helper programs assist the client in presenting certain kinds of
documents such as sound, picture, or video files. The most popular WWW client is
NCSA Mosaic, which comes in versions for Unix workstations running X-Windows, personal computers using Microsoft Windows, and
Macintosh personal computers. Computers without graphical display capability can use text-only clients, such as Lynx, to browse WWW documents.
A WWW server is merely a relay system, receiving requests for WWW documents
from clients and sending out the documents requested. WWW servers appeared first
on UNIX hosts, but versions for Macintosh and other operating systems are now
available. More sophisticated servers pass document requests to database search
routines which return a WWW-formatted file that the server then transmits to
the requesting client. In the future, WWW servers may need to conduct financial
transactions relating to the services they provide (e.g. pay per view).
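The request-and-relay exchange between client and server described above can be sketched with Python's standard library. This is a minimal illustration, not a production server: the document path, its contents, and the use of the local address 127.0.0.1 are assumptions made so that the example runs on one machine.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical published documents, keyed by request path.
DOCUMENTS = {
    "/people/john-example.html":
        b"<html><body><h1>John Example</h1></body></html>",
}


class DocumentHandler(BaseHTTPRequestHandler):
    """Relay a stored document to whichever client asks for it."""

    def do_GET(self):
        body = DOCUMENTS.get(self.path)
        if body is None:
            self.send_error(404, "Document not found")
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demonstration quiet


def fetch(host, port, path):
    """Act as a very small WWW client: request one document."""
    connection = http.client.HTTPConnection(host, port, timeout=5)
    try:
        connection.request("GET", path)
        response = connection.getresponse()
        return response.status, response.read()
    finally:
        connection.close()


def demo(path="/people/john-example.html"):
    """Start a server on a free local port, fetch one path, shut down."""
    server = HTTPServer(("127.0.0.1", 0), DocumentHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch("127.0.0.1", port, path)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    status, body = demo()
    print(status, body.decode("ascii"))
```

A more sophisticated server of the kind mentioned above would replace the DOCUMENTS lookup with a database search that builds the HTML on request.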
WWW is made possible because of the ubiquity of the Internet, a mega-network
of computer networks, the forerunner of the anticipated National Information
Infrastructure, the formal name for the emerging data superhighway.
Transmissions between WWW elements observe the HyperText Transfer Protocol
(HTTP), which itself runs over Transmission Control Protocol/Internet
Protocol, or TCP/IP. WWW services are generally available full-time because they
reside on a network of leased lines rather than dial-up connections. Some of the
links of the Internet carry traffic at very high speeds. Requesting a WWW
document may result in a transmission of as few as 300 bytes or one of several
megabytes. Internet users (or their employers) traditionally do not pay by
usage, but rather on a flat-rate, monthly basis. In the future, payment may
become usage sensitive as Internet service
becomes more generally available at the consumer level.
Because the WWW architecture is document-based, each individual database
record can be represented by a single document. For a genealogy database, this means that the record of an individual person can be contained within a
single document that describes the individual's basic vital information and that
individual's relationship to parents, spouse, and children. Using HTML, these references to other individuals become hypertext links to other
documents which reside either on the same WWW server or any other WWW server in
the world. A user seeking genealogy information through WWW would request the
document representing an individual and then read it on the computer display. If
desired, the user may activate the links to see pictures, hear sounds, or
experience a video clip of the individual on the screen. The user may follow
links to other individuals in order to climb the family tree or follow
relationships where they might lead, individual by individual, document by
document. Using this concept, the need to print out large pedigrees on paper
will diminish, and we won't be cutting down living trees to print family
trees.
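One way to picture "one individual, one document" is the Python sketch below, which turns a person record into an HTML document whose relatives are hypertext links. The field names, the page layout, and the sample people and URLs are all assumptions made for illustration; the actual structure of a GenWeb document is not defined by this proposal.

```python
from dataclasses import dataclass, field
from html import escape
from typing import List, Optional, Tuple

# A link to a relative: (display name, URL of that relative's document).
Link = Tuple[str, str]


@dataclass
class PersonRecord:
    """Hypothetical record layout; GenWeb does not prescribe these fields."""
    name: str
    born: str = ""
    died: str = ""
    father: Optional[Link] = None
    mother: Optional[Link] = None
    spouses: List[Link] = field(default_factory=list)
    children: List[Link] = field(default_factory=list)


def anchor(link):
    """Render one relative as an HTML hypertext link."""
    name, url = link
    return f'<a href="{escape(url, quote=True)}">{escape(name)}</a>'


def render_person(person):
    """Return the HTML document representing one individual."""
    lines = [
        "<html>",
        f"<head><title>{escape(person.name)}</title></head>",
        "<body>",
        f"<h1>{escape(person.name)}</h1>",
    ]
    if person.born:
        lines.append(f"<p>Born: {escape(person.born)}</p>")
    if person.died:
        lines.append(f"<p>Died: {escape(person.died)}</p>")
    for label, link in (("Father", person.father), ("Mother", person.mother)):
        if link:
            lines.append(f"<p>{label}: {anchor(link)}</p>")
    for label, links in (("Spouse", person.spouses),
                         ("Child", person.children)):
        for link in links:
            lines.append(f"<p>{label}: {anchor(link)}</p>")
    lines += ["</body>", "</html>"]
    return "\n".join(lines)


if __name__ == "__main__":
    john = PersonRecord(
        name="John Example",
        born="1850",
        father=("James Example", "james-example.html"),
        mother=("Mary Sample", "http://other-server.example/mary-sample.html"),
    )
    print(render_person(john))
```

In this sample the father's document sits on the same server and the mother's on another, so following the two links crosses servers without the reader needing to know it.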
Up until now, genealogy researchers
have tended to accumulate data from all sources into their own possession in
order to create their own private database representing the family tree. Under
WWW, private accumulation is not required because instant access to the entire
family tree is possible at any time. The data resides "in the network" and the
reader need not be concerned about where it physically exists or on whose hard
drive or whose computer. Possession of genealogy data loses meaning when
everybody can access everything at any time. There will naturally be a variety
of large and small servers on the network, all contributing their part to the
worldwide database. Each genealogy archive can be sponsored by an institution or
an individual, but ideally it must be available full time for the system to be
effective. The data may be stored in any format, such as text files or
database formats, as long as it is served up in the WWW format. Likewise, the
standard WWW client application can display a genealogy file adequately, but
specialized readers may be developed to create specialty displays. The actual
data structure of a genealogy HTML document remains to be defined, but I have
created several examples of how this can be done. Click here to see a demonstration genealogy
display.
Under GenWeb, basic genealogy research will still be necessary. The
existence and details of individuals must still be established using generally
accepted research standards. However, the accelerating effort to computerize
source documents which can then be published on a WWW server will enable each
GenWeb record of an individual to be linked to its sources. Using these links, a
user can re-trace the original researcher's path by immediately displaying a
computer version of the original record. Searching these computerized source
documents and GenWeb documents themselves will be possible following the
creation of indexes on each GenWeb server. Computer programs can be written
which would automatically search out the possible parental links of an
individual no matter where the parents' GenWeb record might be stored. A
researcher's role would then change from seeker of sources to evaluator of found
sources.
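The automated search for parental links might look something like the following Python sketch. To keep it self-contained, the network is simulated by a dictionary mapping URLs to records; the server names, record fields, and the four-generation default are illustrative assumptions, and a real program would fetch and parse each document over the Internet instead.

```python
from collections import deque

# Hypothetical stand-in for the network: URL -> record. The two host
# names only illustrate that parents may live on different servers.
NETWORK = {
    "http://server-a.example/child.html": {
        "name": "Child",
        "father": "http://server-a.example/father.html",
        "mother": "http://server-b.example/mother.html",
    },
    "http://server-a.example/father.html": {
        "name": "Father",
        # A dangling link: the target document is not published anywhere.
        "father": "http://server-c.example/unknown.html",
    },
    "http://server-b.example/mother.html": {
        "name": "Mother",
        "father": "http://server-b.example/grandfather.html",
    },
    "http://server-b.example/grandfather.html": {
        "name": "Maternal Grandfather",
    },
}


def fetch_record(url):
    """Simulated fetch; returns None when no document exists at the URL."""
    return NETWORK.get(url)


def collect_ancestors(start_url, max_generations=4):
    """Follow father/mother links breadth-first from one individual.

    Returns a list of (generation, name, url) for every ancestor found,
    where generation 1 means parents, 2 means grandparents, and so on.
    """
    found = []
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, generation = queue.popleft()
        record = fetch_record(url)
        if record is None:
            continue  # broken or unpublished link; leave for the researcher
        if generation > 0:
            found.append((generation, record["name"], url))
        if generation == max_generations:
            continue
        for role in ("father", "mother"):
            parent_url = record.get(role)
            if parent_url and parent_url not in seen:
                seen.add(parent_url)
                queue.append((parent_url, generation + 1))
    return found


if __name__ == "__main__":
    for generation, name, url in collect_ancestors(
            "http://server-a.example/child.html"):
        print(generation, name, url)
```

The program only gathers candidates; judging whether each found link is genealogically sound remains the researcher's job, as the paragraph above suggests.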
The
potential of GenWeb is to completely supersede personal genealogy
database programs, central genealogy database programs (Ancestral File), and
genealogy data communication protocols (e.g. GEDCOM). Data stored in
present-day systems could be exported into GenWeb file structure or could be
retained in its present form and served up by a translator program on an
as-needed basis. There will be a need for computer applications to perform both
tasks. Also, until the Internet is universally available, the role played by other
computer networks will be valuable.
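A translator from a present-day format could start out as small as the Python sketch below, which reads a few GEDCOM lines and produces one record per individual with links to the parents' documents. It handles only level-0 records and the level-1 NAME, FAMC, HUSB, and WIFE tags; the sample data and the rule that names each document after its GEDCOM cross-reference (I1.html and so on) are assumptions for illustration, not part of any GenWeb specification.

```python
# Hypothetical sample data in GEDCOM's "level tag value" line format.
SAMPLE_GEDCOM = """0 HEAD
0 @I1@ INDI
1 NAME John /Example/
1 BIRT
2 DATE 1 JAN 1850
1 FAMC @F1@
0 @I2@ INDI
1 NAME James /Example/
1 FAMS @F1@
0 @F1@ FAM
1 HUSB @I2@
1 CHIL @I1@
0 TRLR"""


def parse_records(text):
    """Return {xref: {"type": ..., "fields": [(tag, value), ...]}}.

    Only level-1 lines are kept; deeper levels (dates, places) are
    ignored in this sketch.
    """
    records = {}
    current = None
    for raw in text.splitlines():
        parts = raw.strip().split(" ", 2)
        if len(parts) < 2:
            continue
        level = parts[0]
        if level == "0":
            if parts[1].startswith("@") and len(parts) == 3:
                current = {"type": parts[2], "fields": []}
                records[parts[1]] = current
            else:
                current = None  # HEAD, TRLR and similar
        elif level == "1" and current is not None:
            value = parts[2] if len(parts) == 3 else ""
            current["fields"].append((parts[1], value))
    return records


def first(record, tag):
    """Return the first value stored under a tag, or None."""
    for field_tag, value in record["fields"]:
        if field_tag == tag:
            return value
    return None


def person_url(xref):
    """Hypothetical naming rule: @I1@ becomes I1.html."""
    return xref.strip("@") + ".html"


def to_genweb(records):
    """Build one linkable record per individual, keyed by document name."""
    people = {}
    for xref, record in records.items():
        if record["type"] != "INDI":
            continue
        family = records.get(first(record, "FAMC") or "")
        father = first(family, "HUSB") if family else None
        mother = first(family, "WIFE") if family else None
        people[person_url(xref)] = {
            "name": (first(record, "NAME") or "").replace("/", "").strip(),
            "father": person_url(father) if father else None,
            "mother": person_url(mother) if mother else None,
        }
    return people


if __name__ == "__main__":
    for url, person in to_genweb(parse_records(SAMPLE_GEDCOM)).items():
        print(url, person)
```

The same functions could run once as a bulk export or be called by a server each time a document is requested, which are the two approaches the paragraph above describes.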
Other services are coming on line on WWW, including map servers, locality and
tourist information, and other specialty databases. It is relatively easy for
GenWeb documents to contain links to these services to provide background
information to expand on the basic document. To see this in action, see the
GenWeb demonstration.
Some have suggested that using WWW to distribute genealogy
information would allow anyone to snoop into individuals' private files.
This would not be the case for the following reasons:
- Only WWW servers can send out information. Mosaic or other WWW
browsers do not transmit files, but only receive them from WWW servers. Servers
have various mechanisms to be sure that only files that have been put into the
server's "path" can be served to clients. This preserves security of server
files.
- Files representing living persons would be retained locally.
Generally, GenWeb browsers would maintain files on their immediate
family and a few generations back on their local computer. Actually,
genealogical interest in living persons is limited to immediate family
anyway. These files may contain hypertext links to other files (parent or
siblings) that may reside locally or may reside on a remote WWW server. By
following these links, the user can seamlessly build a pedigree chart on
their screen.
- Links from ancestors to descendants are generally too difficult to maintain.
The custodians of GenWeb archives will have their hands full
certifying the parentage or upward links of the files in their WWW server.
They will not be interested in maintaining the downward links to
descendants maintained on other servers because their number will be too
large.
- GenWeb resembles but differs from LDS Ancestral File. The
GenWeb concept differs from the LDS Ancestral File practice of tracing
descendants down to living persons, whose identities must then be
protected. But it does correspond to the LDS "Four Generation" concept
whereby LDS members are responsible for establishing four generations of
ancestry. Beyond those four generations, the genealogical community at
large maintains the database. In GenWeb practice, various "custodians"
will maintain portions of the distributed database on a volunteer basis and
will freely publish their information on their WWW servers to anyone on
the network. The LDS Ancestral File is centrally maintained and its
distribution is tightly controlled.
- The demonstration is only a prototype. In my GenWeb demonstration,
I include information on living persons only for demonstration purposes.
In actual implementation, you would view HTML files on my family only if I
privately (or securely) distribute them to you. However, anyone sharing a
more distant ancestor would be allowed to link to that ancestor's HTML
file on my WWW server through the network.
- Privacy will remain a concern. All GenWebers, both browsers and
custodians, will need to remain alert to breaches of privacy and move to
close any gaps in the system, should they occur. I believe that the
genealogical community will be very vigilant in this matter.
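The first point in the list above, that a server hands out only files placed in its published "path", can be sketched as a simple check in Python. Real servers use their own mechanisms; the directory name /srv/genweb and the function name here are hypothetical, and the sketch only shows the general idea of refusing requests that resolve outside the published directory.

```python
from pathlib import Path


def resolve_published(document_root, requested):
    """Map a requested path to a file inside the published directory.

    Returns the resolved Path, or None when the request would escape
    the document root (for example by using ".." components).
    """
    root = Path(document_root).resolve()
    candidate = (root / requested.lstrip("/")).resolve()
    if candidate != root and root not in candidate.parents:
        return None
    return candidate


if __name__ == "__main__":
    # /srv/genweb is a hypothetical published directory.
    print(resolve_published("/srv/genweb", "/people/john-example.html"))
    print(resolve_published("/srv/genweb", "/../../etc/passwd"))
```

A request for a private file outside the published directory yields None, so the server has nothing to send.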
To illustrate the GenWeb concept, I have prepared a demonstration accessible
by means of a WWW client. This demonstration includes grey-scale and color
photos, biographies, pedigree charts, and descendant charts linked to the
personal page for my grandfather, Wallace
Jones. I have recorded an audio file but plan to edit it down from 350
Kbytes to around 50 Kbytes. I also plan to add a video clip soon. The basic
demonstration file and its supporting files reside on a single server in
California. But there are links to files on other servers. Some locality
information is linked to a geographical database that provides background on
the Utah town where Wallace lived near the end of his life and where he died. Also, files
detailing the ancestry of his spouses (who were distantly related) reside on a
server in Pennsylvania, thus demonstrating the ease of cross-server linking. Please visit this demonstration by clicking on his name, above.
Comments Invited:
Please contact me with your impressions and
ideas concerning this exciting new concept. These and other issues need to be
more fully explored, standards need to be set, programs need to be written, and
data needs to be converted. A loose discussion of GenWeb is taking place in the
gedcom-l Internet mailing list. You are welcome to join us there.
19 June 1994/ Gary Hoffman/ghoffman@ucsd.edu