Thursday, April 27, 1995 8:09:34 AM
GenWeb Item
From: Anders Andersson,andersa@Mizar.DoCS.UU.SE,Internet
Subject: Re: GenWeb Indexing
To: GenWeb

Gary Hoffman writes:
>What we need now:
>1. a volunteer master index (or 'gendex') site which can be designated
>'gendex.den1.genweb.org', and

Wait a minute; I'm afraid there is some misunderstanding here (I assume that "den1" is just a misspelling of "gen1", and that you are really referring to the "gen1.genweb.org" domain to be in effect on May 1st).

When we argued for the creation of a deeper structure, the idea was to reserve a specific namespace for a specific use of the DNS, which at the time was (and still is) a set of CNAME records, one for each database accessed in one particular way.

While the GENDEX server is associated with these databases (in fact, it's supposed to be their master index), GENDEX itself isn't one of those databases. Therefore it makes perfect sense to put it outside the "gen1.genweb.org" domain. You could call it "gendex1.genweb.org" if you envision multiple servers implementing different versions of the master index itself. The important thing isn't the actual name, but the structure.

Compare it with a phone directory. The directory contains a lot of addresses of individuals, but the publisher's address is usually stated somewhere near the title page, rather than as a regular entry between Mr Puangjampa and Mrs Puchi.

Of course, this isn't quite as crucial as the point of obtaining that structure in the first place. We could probably live with "gendex.gen1.genweb.org", but I feel that such a name may contribute to confusion over why that particular structure was chosen in the first place. A random mixture of careful planning and occasional ad-hoc solutions doesn't really seem worth the trouble of going through all that careful planning to begin with.

Anyone, please tell me if you are bothered by my occasional fits of nerditis, or I may go on forever... ;-)

--
Anders Andersson, Dept.
of Computer Systems, Uppsala University
Paper Mail: Box 325, S-751 05 UPPSALA, Sweden
Phone: +46 18 183170
EMail: andersa@DoCS.UU.SE

Thursday, April 27, 1995 9:28:18 AM
GenWeb Item
From: olsen@cs.byu.edu,Internet
Subject: re: Time for an index?
To: GenWeb

>If someone is looking at my tree where I claim that John Smith is the son
>of Robert Smith and sees a link to your John Smith who is the son of James
>Smith, they'll want to know that we're both referring to the same person.
>Hopefully, along with the link from my database to yours, I'll have a note
>on why I disagree with your claims. Equally hopefully, you'll have a link
>back to mine with an explanation of why you disagree with my claims.
>

This is exactly the reason that we need a more stable referencing mechanism than URLs, which break every time a file gets moved.

____________________
Dan R. Olsen Jr.
Computer Science Department
Brigham Young University
Provo, UT 84602
(801) 378-2225
FAX 801-378-7775

Thursday, April 27, 1995 9:50:37 AM
GenWeb Item
From: T.T.Wetmore,ttw@beltway.att.com,Internet
Subject: Re: Some comments about WWW
To: GenWeb

Annelise Anderson (>):

>...the founders and gurus of GENWEB...are going to have the
>responsibility for establishing some standards at some point.

But why now? Survival of the fittest is often all that is needed in such situations. Standardization is usually an after-the-fact activity. Before-the-fact standardization is generally ignored (except in Europe -- that's a joke, please no flames). GENWEB is still going through its wild west phase, and may need to for a year or so more.

>...consider Cliff's frustration with GEDCOMs, as well as the frustrations
>of many users, not only those whose GEDCOMs are rejected but the many
>others whose messages we read who are having difficulty exchanging data.
>The GEDCOM standard was established (and has been updated) by the Church
>of Christ of the Latter Day Saints...originally for more limited purposes
>than the purposes for which it is now being used by GENSERV, by programs
>that convert GEDCOMs to HTML, and so forth.

To take another contrary position, one I've taken many times: the problems with GEDCOM have very little to do with its intention or inherent capabilities, and very much to do with the poor stewardship the LDS/FHS has given it, and the poor quality of support that genealogical systems give to it. There is nothing inherent or antiquated or old technology in GEDCOM itself that causes these problems.

I disagree with your implication that the GENWEB experimenters have some responsibility to start certifying things, but am glad you are raising the issues. GENWEB is a group of by and large unorganized experimenters, trying out different ideas to make genealogical data available on the WWW. They are not a trade group or a vendor consortium, and the surface of the technical possibilities is only just starting to be scratched. Let's not jump the gun.

>The interest in Internet genealogy is, I think, expanding rapidly enough,
>including the interest in being able to produce data that is suitable for
>various WWW purposes, that the power to influence the manufacturers of
>the programs is on its way. Yes, they want to make money; but they will
>find the sales of their products are hurt if they do not provide output
>that is suitable for the coming technology.

I'm not always negative, but I might as well continue the trend in this response. One. The genealogical software market is very small. Two. The slice of the genealogical software market that knows anything about the Internet or the WWW is very, very small (though growing, I'll admit). Don't forget that the 30-50 or so experimenters in genealogy on the WWW and the 300-500 persons interested enough to keep track of their experiments (us!)
is the extent of this whole thing, in the whole wide world of 5 billion people. It is not time to start imagining that anything we do will have a realistic impact on any part of the market. Except on me, I guess, since I am the LifeLines guy who supplies the genealogical engines and HTML generators for a number of the WWW sites, and a few other programmers with day jobs. So I'm probably wrong. Ignore this note.

>While LDS may produce the best GEDCOM in the business, I do not think it
>is their desire to... [garbled]...

You have a much higher opinion of the LDS than I. Define what you mean by "best." I'd agree with "only" but never with "best." GEDCOM is like religion: it's supposed to bring us together, but it really separates us further (hey, that's a great quote!).

Tom Wetmore, crusty old fart, ttw@beltway.att.com, 4/27/95

Thursday, April 27, 1995 10:05:28 AM
GenWeb Item
From: grantham@math.uga.edu,Internet
Subject: Living people's vital statistics...ethics?
To: GenWeb

Hi. I am just starting to put my family's genealogy on my Web page, and I have a few questions of a different nature than the interesting technical questions that have been filling the list of late.

1) Should I avoid putting the birth dates of living people on the Web w/o their permission? Some companies, unfortunately, use a birthdate as a "password" over the phone, and I don't want to make life difficult for them. Some people also don't like being wished a happy birthday.

2) Should I omit the birth year? It might be reasonable to say that I don't have a right to announce to the world, say, that my Aunt Pam is 46 (whoops, sorry Aunt Pam :-) if she didn't want that known.

3) Should I omit living people altogether without their permission? Presumably anyone doing research could trace back to a deceased individual and then contact me if they want information about living relatives.

On the other hand, I conceive of my web offering as more robust than just a way to exchange vital stats on people.
It's the story of my family (or it will be), and I'd like to tell people, for example, that my cousin Rich served in the Marines or my Uncle Terry is going back to school to become a teacher. Getting permission from close relatives (e.g., my parents) is one thing; getting in touch with and explaining the Web to distant relatives becomes more of a problem.

My gut feeling is to cut off everything but birth year for relatives as close as first cousins, and chop off any relatives beyond first or second cousins.

Comments? Please!

Jon Grantham
grantham@math.uga.edu

Thursday, April 27, 1995 10:49:04 AM
GenWeb Item
From: Birger A. Wathne,Birger.Wathne@vest.sdata.no,Internet
Subject: GenWeb - some proposals
To: GenWeb

I seldom have the time needed to write anything here, but I read, and I think. Perhaps not opening my mouth too often is a good thing....

I think realization of global searching is possible, but a bit into the future yet. Some experiments are required before we decide how to proceed. I think some other parts of the infrastructure must be ready first.

There has been some talk about URNs again. As far as I know, URNs have not been implemented except as tests yet. But I have a proposal that would give us URN capabilities today, and that could be mapped into URNs in the future.

Let's first look at the URN proposal:

We need a unique reference to each registration of a person. Some of these registrations will refer to the same person, but in different databases. We will never be able to avoid this, as people may disagree, etc. So each person record around the web needs a unique reference ID.

We could do this by issuing each GenWeb participant with a set of unique database IDs. I plan to split my own family into separate databases to keep search times lower, and simplify parallel searching. So I will need more than one database ID. Each database administrator should then make sure each person has a unique ID within this database.
This could be any kind of alphanumeric reference ID, but it should not build on names, dates, or anything that might change. The number should be stable, preferably serially or automatically assigned. The combination of database ID and person ID would then be the needed data for the URN.

URNs aren't there yet, but:

We should be able to use this info to create a future URN looking something like URN: GenWeb-person:/// or whatever the syntax of URNs will be. The URN service could then just map these requests into something we can make available today.

How to fake URNs using DNS:

A DNS based system would fix any replication/stability problem. DNS is available today, and has been designed to handle this. We may have to use DNS in a non-standard way, but.....

I would propose a subdomain dbptr.genweb.org containing one entry for each issued database ID. This entry should be a simple CNAME pointer to the name of the host holding the database. Thus, a reference to a host called DB0001.dbptr.genweb.org would refer to the host serving this database.

As a DNS CNAME record can't be combined with other records for the same name, we will need a separate subdomain, say dbinfo.genweb.org, to hold the other needed info for a GenWeb node. This domain could use the TXT field to hold different standardized strings. I would recommend at least

    DB0001  IN TXT Contact:Birger Wathne
            IN TXT Email:Birger.Wathne@sdata.no
            IN TXT Person:/cgi-bin/GenWeb/LookupInternal?BASE=$BASE/INDEX=$INDEX

For nodes using plain HTML files, the last line could be

            IN TXT Person:~username/family/$BASE/$INDEX

The variables $BASE and $INDEX should be expanded with the database name and person index.
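
The template expansion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the TXT strings are held in a plain dictionary instead of being fetched from dbinfo.genweb.org, the reference is taken to have the form GenWeb-person://<database ID>/<person ID>, and the names DBINFO and expand_person_urn are made up for the example. Prefixing "/" to a template that does not start with one (as in the plain-HTML case) is also an assumption, not part of the proposal.

```python
import string

# Hypothetical stand-in for the TXT strings under dbinfo.genweb.org.
# A real client would look these up in DNS; here they are hard-coded.
DBINFO = {
    "DB0001": {
        "Contact": "Birger Wathne",
        "Email": "Birger.Wathne@sdata.no",
        "Person": "/cgi-bin/GenWeb/LookupInternal?BASE=$BASE/INDEX=$INDEX",
    },
    "DB0002": {  # invented entry showing the plain-HTML style of template
        "Person": "~username/family/$BASE/$INDEX",
    },
}

URN_PREFIX = "GenWeb-person://"


def expand_person_urn(urn, dbinfo=DBINFO):
    """Turn 'GenWeb-person://DB0001/I1' into an http URL."""
    if not urn.startswith(URN_PREFIX):
        raise ValueError("not a GenWeb-person reference: %r" % urn)
    base, sep, index = urn[len(URN_PREFIX):].partition("/")
    if not (base and sep and index):
        raise ValueError("expected <database ID>/<person ID>: %r" % urn)
    template = dbinfo[base]["Person"]
    # Fill in $BASE and $INDEX exactly as the proposal describes.
    path = string.Template(template).substitute(BASE=base, INDEX=index)
    if not path.startswith("/"):
        path = "/" + path  # assumption: such templates hang off the server root
    # The host part is the CNAME entry under dbptr.genweb.org.
    return "http://%s.dbptr.genweb.org%s" % (base, path)


print(expand_person_urn("GenWeb-person://DB0001/I1"))
```

Because the host name is always the dbptr.genweb.org alias, moving a database to another machine would only require changing its CNAME record; references held by other sites would not need to be edited.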
Thus,

    URN: GenWeb-person://DB0001/I1

would expand to

    URL: http://DB0001.dbptr.genweb.org/cgi-bin/GenWeb/LookupInternal?BASE=DB0001/INDEX=I1

How to use the scheme:

- People using HTML files: Either modify the program making the HTML files from your source so it queries DNS automatically when seeing references to some URN format in your code, or (if you have few references), look them up manually once in a while using nslookup and hand edit the files.

- People using databases with report generators: The report generator should somehow be able to do the queries. It will not be possible directly from LifeLines reports, but it should be possible to filter them, or use other report generators (I have started looking at a perl module to give access to LL databases).

Problems:

I think the client sites will have to upgrade their resolver libraries to BIND 4.8.3? I'm not sure. Solaris 2.x is at this level already.

I think we should agree on:
- The structure of the genweb.org domain
- The set of supported/needed strings in TXT tags
before using a lot of energy. I feel that we should be able to agree on these points now. And my proposal would be:

genweb.org:
- At top-level, services like www.genweb.org (GenWeb homepage), ftp.genweb.org (ftp site to get GenWeb software).
- Subdomain dbptr.genweb.org with pointers to databases.
- Subdomain dbinfo.genweb.org with 'URN' info.

The set of supported strings:
- Contact: Name of database contact person
- Email: Email address of contact person
- Descr: 1 line description of database
- Person: Template to convert GenWeb-Person queries to URLs.
  Should give pointer to HTML page for this person.
- Index: Template to convert GenWeb-Index queries to URLs.
  Should give pointer to HTML index page for this database.
- Info: Template to convert GenWeb-Info queries to URLs.
  Should give pointer to HTML info page for this database.
- Search: Future search option. But we need to grok this first.

Birger A.
Wathne    Email : birger@sdata.no
Private: Totlandsvn. 458, N-5050 Nesttun | Tel: +47 55 10 39 96
Work   : Skrivervik Data A/S             | Tel: +47 55 54 37 31
         Thormoehlensgt 55, N-5008 Bergen| Fax: +47 55 32 28 53

Thursday, April 27, 1995 11:05:57 AM
GenWeb Item
From: Rik Vigeland,rikv@wv.MENTORG.COM,Internet
Subject: Re: Time for an index?
To: GenWeb

On Apr 27, 9:24am, olsen@cs.byu.edu wrote:
> Subject: re: Time for an index?
>
> >If someone is looking at my tree where I claim that John Smith is the son
> >of Robert Smith and sees a link to your John Smith who is the son of James
> >Smith, they'll want to know that we're both referring to the same person.
> >Hopefully, along with the link from my database to yours, I'll have a note
> >on why I disagree with your claims. Equally hopefully, you'll have a link
> >back to mine with an explanation of why you disagree with my claims.
> >
> This is exactly the reason that we need a more stable referencing mechanism
> than URLs which break every time a file gets moved.
> ____________________
> Dan R. Olsen Jr.  Computer Science Department

Our company creates software which uses "Soft Pathnames", but it does require an item we call a "Mapping File". This allows us to move large amounts of data from one computer to another or one site to another. Each "soft" entry in the mapping file is followed by one or more "hard" locations where the item MIGHT be found, starting with the most likely. The search software is allowed to attempt as many sites as are listed. So, we can set up "soft" names which NEVER change, AND decide how much searching we will allow.

Example entries from the Mapping File look like this:

    $RIKV      /user/rikv
               /net/letterman/fs/31/home/rikv
    $PROJECT1  /net/geraldo/fs/17/projects/project1
               /net/oprah/fs/6/projects/project1.backup

In this way, we can ALWAYS refer to an object by, for example,

    $PROJECT1/item3/subitem4/documents/file1.txt

This only addresses the aspect of allowing databases to move.
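
The mapping-file lookup described above can be sketched roughly as follows. This is an illustration in Python, not the actual product: the file layout (whitespace-separated tokens, with each "$" token starting a new soft entry) is an assumption, and the function names parse_mapping, candidates, and resolve are invented for the example.

```python
import os

# The example entries from the message, as a whitespace-separated token stream.
MAPPING_TEXT = """\
$RIKV /user/rikv /net/letterman/fs/31/home/rikv
$PROJECT1 /net/geraldo/fs/17/projects/project1 /net/oprah/fs/6/projects/project1.backup
"""


def parse_mapping(text):
    """Return {soft name: [hard locations, most likely first]}."""
    mapping = {}
    current = None
    for token in text.split():
        if token.startswith("$"):
            current = token          # a new soft entry begins
            mapping[current] = []
        elif current is not None:
            mapping[current].append(token)
    return mapping


def candidates(soft_path, mapping, max_tries=None):
    """List the hard paths to try for a soft path, in listed order."""
    soft_name, _, rest = soft_path.partition("/")
    hard_roots = mapping[soft_name][:max_tries]   # limit how much searching we allow
    return [root + "/" + rest if rest else root for root in hard_roots]


def resolve(soft_path, mapping, exists=os.path.exists, max_tries=None):
    """Return the first hard path that exists, or None if none is found."""
    for path in candidates(soft_path, mapping, max_tries):
        if exists(path):
            return path
    return None


mapping = parse_mapping(MAPPING_TEXT)
for path in candidates("$PROJECT1/item3/subitem4/documents/file1.txt", mapping):
    print(path)
```

The max_tries argument corresponds to deciding "how much searching we will allow"; the exists argument is a hook so that the same lookup could probe remote sites instead of the local file system.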
It does not address the fact that some individuals may be in multiple databases with differing information.

It may work for indexes as well as databases, since both may be in locations which move. I suspect that we will need multiple indexes, so that if one site "goes down" another is available.

Another aspect, however, is that we will need indexes which are FAST. To accomplish this, the indexes may need to be broken up into alphabetical or subalphabetical files to minimize the search time within them. One possible way to split index files would be along soundex lines. But alternate name spellings may also produce alternate soundex codes.

OK, so I'm just thinking "out loud" on this one. Let someone smarter than me figure out the best software methods for fast indexes.

In summary, "soft" locations allow indexes and databases to move, so long as someone updates the map(s) of hard locations (so where do we keep the maps?).

Rik Vigeland
rikv@wv.mentorg.com

Thursday, April 27, 1995 1:43:42 PM
GenWeb Item
From: Birger A. Wathne,Birger.Wathne@vest.sdata.no,Internet
Subject: Searching, surname lists, etc.
To: GenWeb

A surname list wouldn't be terribly useful in large parts of Europe. In Norway, it was common until 1900 to inherit your father's given name as surname. It's still the common thing in Iceland. In addition, it was quite common to change names depending on where the family lived, etc....

I have been thinking about global searching. There is a system called harvest that seems ideal for GenWeb. For sites using plain HTML files, a harvest gatherer could scan the files and build a compressed database of the search terms. The gatherer can be tailored to the layout of the data, so it can build sensible data for searching. These databases can then be replicated at remote servers to distribute load, etc. Access to the collected data is through 'brokers'. For sites running database software, the broker can be tailored to search the database directly.
It is also possible to build broker interfaces to search other brokers in parallel.

This is my understanding from a quick reading of the features of harvest. What kind of platforms it can run on, etc., will have to be investigated, and I certainly don't have the time now. Any volunteers?

Web crawlers are fine for flat HTML files, but they are not the right thing for databases. I think we should find some way that will distribute searching where possible. It should avoid overloading a single server. Perhaps it will overload a bunch of servers around the world instead.

Birger

Thursday, April 27, 1995 2:56:09 PM
GenWeb Item
From: Cliff Manis,cmanis@progcons.com,Internet
Subject: posting these comments
To: GenWeb

I tried to post this yesterday, and it bounced.

Readers:

Just to give you all an idea of how many names, and the sizes of the GEDCOM files, being received on diskette. These datafiles have been received (by U. S. Postal Mail, on diskette) during the last two days. Until today, I did not check the numbers, but the GenServ system is close to having 1,000,000 names in GEDCOM data at this point.

    ALLEBA1.GED    4726
    BRAYTO1.GED    3441
    CORSBO1.GED    2369
    EISKJA2.GED   49520
    GOLDLO1.GED    1964
    LANTCU1.GED   11985
    LIGHSH1.GED    2244
    MARLRU1.GED    4320
    MCEAIA1.GED    2234
    NEUMAL1.GED    6358
    NORIBI1.GED   35260
    OTTIKR1.GED   23607
    WALLRO1.GED   54445
                 ======
                 198163  names received and processed in the last two days.

We have approximately 50 GEDCOM datafiles (not including the ones above) which have not been entered in the system yet because of upgrades to the system, and I want to add those to the new system.

All GEDCOM files received are renamed to something like those seen above. For example, if your name is Tom Brown, and this is the first GEDCOM you have sent to the system, I would probably name that file "browto1" if there was not already a database file by that name. If you send a second GEDCOM file, it might be named "browto2".
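
The renaming convention can be sketched as below. The rule used here (first four letters of the surname, first two letters of the given name, then the lowest unused number) is only inferred from the "browto1" example and the file names listed above; it is not a statement of how GenServ actually assigns names, and the function name genserv_name is hypothetical.

```python
def genserv_name(given, surname, existing):
    """Pick the first free name such as 'browto1', 'browto2', ..."""
    # Inferred rule: 4 letters of surname + 2 letters of given name.
    stem = (surname[:4] + given[:2]).lower()
    taken = {name.lower() for name in existing}
    n = 1
    while stem + str(n) in taken:   # skip numbers already in use
        n += 1
    return stem + str(n)


print(genserv_name("Tom", "Brown", []))            # browto1
print(genserv_name("Tom", "Brown", ["browto1"]))   # browto2
```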
I'm sorry, but all the databases which contain genealogy on the BROWN family history cannot be called BROWN.GED in the system.

There are lots of administrative things to be done when running a system, and GenServ has its chores which must be done also. These need to be done RIGHT the first time. I don't need mistakes in the system, from wrong database names or data sent to invalid addresses.

Good luck to all,

Cliff

--
Cliff Manis    cmanis@progcons.com    Seoul, Korea
GenServ "Genealogical Server": a service for making GEDCOM data available.
For GenServ info, just send a message to: genserv-info@progcons.com
WWW Genserv URL: http://www.cs.ncl.ac.uk/genuki/GenServ/

Thursday, April 27, 1995 4:52:15 PM
GenWeb Item
From: Bill Minnick,svpafug@rahul.net,Internet
Subject: Re: Copy of: re: Time for an index?
To: GenWeb

TO: Mickey Lane: (regarding your comments to Scott McGee)

>>I don't know what the end meaning of all this is, but I am sure that these
>>points should be considered before we decide on anything intended to be a
>>final solution
>
>I don't think there is an end meaning.
>
>Something I should point out is that the discussion has sort of migrated from
>indexing web pages to linked databases. A subtle but significant shift!
>
>In the former, you're talking about a seamless interface to the results
>of private work. In the latter, you're talking of an endless thread running
>through private and public parts of the material used to generate the
>results - overlaid by the seamless interface to view.... etc.

Mickey, Scott and other GENWEB Subscribers:

My immediate reaction is: If the concept of private data bases complicates our ability to create a workable GENWEB solution, then shouldn't we eliminate the concept of private data bases from Genweb? Anyone is free to keep a private data base independent of GENWEB.

It appears that the subject of GENWEB indexing can not be resolved until we nail down the scope of the GENWEB concept.
Perhaps it's time to hammer out a vision statement for the broader GENWEB Concept.

Here's my shot at a GENWEB Vision Statement. I want to see GENWEB do the following:

1) Let anyone search for persons based on a wide variety of search criteria, not just surnames (we probably need to index every word, including biographical and source notes, on a family or individual page to cover all bases);

2) Let anyone enter and send in corrections, additions, updates, new sources, and/or linkage updates to existing information immediately (I'd like to have a way to edit the HTML page as it sits on my monitor, then resubmit on the spot. If the information is in conflict, I'd like the originating machine to keep the original page, but store a revised page "below" the original page with a new link to my page in the original published page --- I.E., LET'S TAKE ADVANTAGE OF TWO WAY COMMUNICATIONS AVAILABLE ON THE WEB!! THIS MEANS FORGETTING THE CONCEPT OF GENEALOGY AS BOOKS ON LIBRARY SHELVES!);

3) Let anyone submit "new" individuals, families, links; sources, biographical notes (I'd like to see Genealogy-Oriented organizations -- like our Silicon Valley PAF User's Group -- fund the installation of genealogy sites on the Web with Mega-Gigabytes of disk storage and perhaps some software development so the public can easily place genealogy data into "public" GENWEB facilities -- for all of us to add to or argue about, if we have an input, until it is correct.);

4) Let us pool and share on-going genealogy research notes and relationship theories (I've said before, let's make the research notes and discussions a part of the individual information in the data base -- I've proposed the idea of "E-Mail to the deceased individual" in a GENWEB data base with cc to current researchers as a means of preserving research dialog.
This research information could assist researchers a month from now or hundreds of years from now -- someone in the future who finds new information could take advantage of our prior efforts. Our research won't be lost when our heirs clean out our desks!);

5) Help us locate and merge (or link) duplicate individuals and marriages (I'd like to see us develop software methods which work like spiders crawling the web and creating lists of possible duplicate individuals, or perhaps even automatically placing a link into both pages of possible duplicate persons);

7) Provide us with any genealogy software tools we might need to use GENWEB (this might include a collection of useful freeware or shareware tools for preparing and qualifying GEDCOM files, etc.);

8) Deliver genealogy data, one family at a time -- GEDCOM format, freely and immediately on request (Many of us will want to keep parallel private data bases at home, and will want to grab the latest info on our ancestors, or on the descendants of a particular person).

I invite any and all GENWEBers to add to/modify the above 8 points, and perhaps we can settle on an acceptable set of objectives for GENWEB.

Cheers,
Bill Minnick

Thursday, April 27, 1995 5:14:31 PM
GenWeb Item
From: LouPero@aol.com,Internet
Subject: PRIVACY
To: GenWeb

I have had the opportunity to observe the Ancestral File at the LDS Family History centers, a database of several million names. After hearing the reactions of countless people, both members and nonmembers of the Church, I would like to say that I think the policy of not putting the names of living people on the database is a good one. I think you are risking lawsuits unless you have the specific permission of the people involved.

You should probably let your uncle publish the news that he is going back to school for himself, if he wants to. Although they are your relatives, their lives are not your property.
There was a good discussion thread on the subject of publishing unfavorable or unsavory information about dead people on the ROOTS-L mailing list. If people have their doubts about what can be said about dead relatives, I think we should probably have even more doubts about what we say about living relatives.

Thursday, April 27, 1995 5:25:07 PM
GenWeb Item
From: LouPero@aol.com,Internet
Subject: taking responsibility
To: GenWeb

I would like to add my two cents worth about someone setting up some standards. I have been listening to the GEDCOM-L mailing list for quite a while. The thing that I notice the most is that no one wants to say, "Okay, guys, all these are good ideas but this is what we are going to do." Not even the LDS Church seems willing to do that, from what I gather. Therefore, GEDCOM is essentially not working.

So, I think if you want GENWEB to work, someone has to gather up all the ideas and DECIDE. After all, someone else can always start their own system if they are too disgruntled. That is the way it ought to be.

Thursday, April 27, 1995 5:52:26 PM
GenWeb Item
From: Phlete Teachout,fteachou@eagle1.eaglenet.com,Internet
Subject: Re: GenWeb - some proposals
To: GenWeb

Nothing personal, Birger - but you just turned me off.

As an amateur genealogist whose technical capabilities are severely limited by old age, senility, and the utter lack of any formal computer training, what you propose is much, much more than I am willing to attempt.

As a bit player, I will gladly make a GEDCOM available for indexing - or even several. I will gladly provide some central 'authority/master index place' with a pointer to my information. If the GEDCOM I currently use is somehow inadequate, I would even be happy to convert to a 'better' standard. But I think that is about it. I would then expect the indexing machine/bot/agent/whatever to be able to parse my GEDCOM for the information it needs for the 'master index.'
The information I envision being presented in the text would be name/birth-death dates/birth-death places (but I'm flexible).

I am not in the least concerned with duplications - I'm used to them. I'm not particularly interested in 'linking' to other databases - it's too unreliable.

If your system is able to parse my GEDCOM/database and come up with all the requisite %, $, &, etc. - that would be fine also.

- fleet -

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
< P.R. "Fleet" Teachout - Net Surfer >
< fteachou@eagle1.eaglenet.com >
< >
< "Knowledge is of two kinds. We know a subject ourselves, or >
< we know where we can find information upon it." >
< Samuel Johnson 1709-1784 >
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Thursday, April 27, 1995 6:55:43 PM
GenWeb Item
From: Bill Minnick,svpafug@rahul.net,Internet
Subject: Re: Searching, surname lists, etc.
To: GenWeb

>I have been thinking about global searching. There is a system called
>harvest that seems ideal for GenWeb.

Birger,

I have the same feeling about Harvest as you do, and a little more info on it. Anyone can read about it by contacting the URL below.

The Harvest Information Discovery and Access System at http://harvest.cs.colorado.edu/

The Harvest source code appears to be available, so it can be set up on a number of separate computers.

As time permits, I'll try to look into this more, but I welcome others with more time and/or experience to assess this tool.

Regards,
Bill Minnick

Thursday, April 27, 1995 8:09:45 PM
GenWeb Item
From: Chris Garrigues,cwg@DeepEddy.Com,Internet
Subject: Re: Copy of: re: Time for an index?
To: GenWeb

At 6:41 PM 4/27/95, Bill Minnick wrote:
> Here's my shot at a GENWEB Vision Statement.
> I want to see GENWEB do the following:
> [ 8 points on what Bill thinks Genweb is/should be ]

To me, what makes the genweb concept interesting is the potential ability to link billions of ancestors from millions of different researchers with differing levels of expertise, etc. I would like anybody who "knows" some portion of their family tree to be able to put this data somewhere that I have a reasonable chance of finding something useful in it. I also need to have a reasonable chance of determining the "value" of the information.

I view a future in which "everybody" has an address on the World Wide Web and in which anybody who has some interest in genealogy can fairly easily say, "Here's what I know about my family" to the genweb project. At some point (hopefully soon) after they do this, a knowbot out there notices that their great grandmother might be my great granduncle's second wife and (a) tells them that my database is out there, and (b) tells me that they've just put their database on the web. We could then each look at one another's databases and decide for ourselves if they really are the same person. If so, we could then link our databases for future browsing.

Once our databases are linked, if my research in the dusty rooms of some town hall in Germany should turn up a new ancestor which we have in common, the act of my adding this ancestor to genweb would notify anybody else with interest in that branch of the tree that there's been a change. They might then look at my new data, and suddenly realize that some other data which they'd previously discarded now makes sense, and ....

Now, obviously, this is still a pipedream at this early stage, but I think that such a model requires distributed databases, both because of the huge number of people (dead and alive) involved and because of the distributed authority.
There are some very difficult issues in a distributed database with such fuzzy matching criteria, but the difficulty of those issues should not mean that we put everything in a single database and try to use standard relational database techniques.

One way to think about it is that genweb could be groupware for all genealogy research in which those people who have interests in a particular area can easily find one another. Suppose you could do a search for every ancestor who lived in Stockholm between 1750 and 1800, and then send email to everybody who has flagged an interest in those ancestors asking a general question about that time and place.

Clearly we need multiple indexes into this distributed database, but the surname index is the first obvious one. Spacetime comes next. Due to the potential size, an index of "every word" isn't feasible, so I have a little trouble with your point (1).

Your point (4) is one that I've suggested before and still like. You'll note that part of my pipedream is an extension of that.

Chris

Chris Garrigues    cwg@DeepEddy.Com
Deep Eddy Internet Consulting    +1 512 432 4046
609 Deep Eddy Avenue
Austin, TX 78703-4513 USA
http://www.DeepEddy.Com/~cwg/

Thursday, April 27, 1995 8:17:40 PM
GenWeb Item
From: Chris Garrigues,cwg@DeepEddy.Com,Internet
Subject: Re: GenWeb - some proposals
To: GenWeb

At 7:51 PM 4/27/95, Phlete Teachout wrote:
> As an amateur genealogist whose technical capabilities are severely
> limited by old age, senility, and the utter lack of any formal computer
> training, what you propose is much, much more than I am willing to attempt.

I think it's important to realize that *many* genealogists are like Phlete. Genweb will certainly start as a system that only encompasses the computer geeks amongst us, but the goal *has* to be such that we can include all those who would rather dig through old documents than computer files.

Many genealogists are retired people who got into genealogy as they began to feel increasingly mortal.
They aren't (and shouldn't be) techowizards. Once the substrate is there for us geeks, there's probably a useful sidemarket in getting data from people like Phlete into the database and pulling data out of the database for them. This may even be a money making market, although any particular customer probably won't be able to pay very much. Chris (Phlete Teachout is a rather interesting name. I'm assuming that it's a real name because this is a genealogy forum, and pseudonyms are counter-productive in such a forum, so I'm curious what the roots of both your first and last name are.) Chris Garrigues cwg@DeepEddy.Com Deep Eddy Internet Consulting +1 512 432 4046 609 Deep Eddy Avenue Austin, TX 78703-4513 USA http://www.DeepEddy.Com/~cwg/ Thursday, April 27, 1995 9:59:34 PM GenWeb Item From: Michael A. Patton - genealogy on the web,MAP=GenWeb@BBN.COM,Internet Subject: Re: GenWeb - some proposals To: GenWeb Date: Thu, 27 Apr 95 19:43:46 +0200 From: "Birger A. Wathne" I seldom have the time needed .... Me, too. But the interesting stuff is starting up again, so I thought I'd chime in... We should be able to use this info to create a future URN looking something like URN:GenWeb-person:/// I think you've got the syntax slightly wrong, but it occurred to me that in the early tests we won't ever actually have a URN, they'll be a later optimization... My theory on what to do is fairly similar to what you described. First, note that in most cases, some (mostly) automated process will be translating GEDCOM to HTML. When URNs are a reality, GEDCOM external references can be turned into URNs with a simple transform (I expect just adding a prefix). In the meantime they could be expanded as you describe, except I would remove some of the hair you have because I don't think it's needed...specifically the "variable substitution".
My proposal: two subdomains (I'm not stuck on specific names).

The first is populated only with CNAME records to allow for persistent host naming as now, but this is independent of the pseudo-URN operation, except that the URN->URL translation is expected to use hostnames in this domain. (This might be called server.genweb.org.)

The second is populated with TXT records and is used as the distributed database for translation. I would do this a little differently than you have, though. For every database that wants to be named in a "network reference" in another GEDCOM DB, there's an entry here that has TXT records that can include the same sorts of things you have, but particularly I'd have:

    Base-URL:   Which contains a full URL to which you append the rest of the pointer from the GEDCOM
    Index-URL:  A URL which if read returns an index (in a format to be defined)

This is specifically tied to the (semi-)defined network reference in the GEDCOM (not-yet) standard...so it might be called gedcom.genweb.org

One reason I want to put the full URLs in here is to allow database naming and host naming to be independent, allowing among other things, multiple databases on the same server.

Here's an example (fictional) written as part of genweb.org zone, using the "might be" names from above:

    MAP.server   CNAME  www.map.cambridge.ma.us.
    MP01.gedcom  TXT    "Contact:Mike Patton"
                 TXT    "Base-URL:http://MAP.server.genweb.org/map/geneal/"
                 TXT    "Index-URL:http://MAP.server.genweb.org/map/geneal/index/"
    AA04.gedcom  TXT    "Contact:Andrew Adams"
                 TXT    "Base-URL:http://MAP.server.genweb.org/genet/aa04/"
                 TXT    "Index-URL:http://MAP.server.genweb.org/genet/aa04/index/"

Which shows me running one server serving a database for me and another for someone else... This would make the GEDCOM reference @MP01:I1034@ become and also translate @AA04:IX3F@ to

A DNS based system would fix any replication/stability problem.
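The Base-URL translation in the example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the TXT strings are copied from the fictional zone into a dictionary rather than fetched from DNS, the reference pattern `@<database>:<record>@` is inferred from the two sample references, and the names `ZONE_TXT`, `parse_txt` and `resolve` are hypothetical. The rule it applies is the one stated in the proposal: append the rest of the pointer to Base-URL.

```python
import re

# TXT strings copied from the fictional zone example; a real resolver would
# query DNS for <database id>.gedcom.genweb.org instead (assumption).
ZONE_TXT = {
    "MP01": [
        "Contact:Mike Patton",
        "Base-URL:http://MAP.server.genweb.org/map/geneal/",
        "Index-URL:http://MAP.server.genweb.org/map/geneal/index/",
    ],
    "AA04": [
        "Contact:Andrew Adams",
        "Base-URL:http://MAP.server.genweb.org/genet/aa04/",
        "Index-URL:http://MAP.server.genweb.org/genet/aa04/index/",
    ],
}

# Assumed shape of a network reference: @<database id>:<record id>@
NETWORK_REF = re.compile(r"^@([A-Za-z0-9]+):([A-Za-z0-9]+)@$")


def parse_txt(records):
    """Turn 'Name:value' strings into a dict, splitting on the first colon only."""
    attrs = {}
    for record in records:
        name, _, value = record.partition(":")
        attrs[name] = value
    return attrs


def resolve(reference, zone=ZONE_TXT):
    """Translate a GEDCOM network reference by appending the record id to Base-URL."""
    match = NETWORK_REF.match(reference)
    if match is None:
        raise ValueError(f"not a network reference: {reference!r}")
    database_id, record_id = match.groups()
    if database_id not in zone:
        raise KeyError(f"unregistered database: {database_id}")
    return parse_txt(zone[database_id])["Base-URL"] + record_id


print(resolve("@MP01:I1034@"))
print(resolve("@AA04:IX3F@"))
```

Splitting on the first colon only matters here because the values themselves contain colons (`http://...`). Moving a database then means editing one TXT record, and every reference of the form @MP01:...@ keeps working.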
Well, not while there are only two servers and they're both at UCSD, but there's a well known solution to that... -MAP Thursday, April 27, 1995 10:09:14 PM GenWeb Item From: Michael A. Patton - general reply address,MAP=Reply@BBN.COM,Internet Subject: Re: Copy of: re: Time for an index? To: GenWeb Date: Thu, 27 Apr 1995 16:41:17 -0700 To: genweb@ucsd.edu From: Bill Minnick My immediate reaction is: If the concept of private data bases complicates our ability to create a workable GENWEB solution, then shouldn't we eliminate the concept of private data bases from Genweb? But, the whole point of the GenWeb project is linking distributed independent databases. The single database case is already solved (with at least two pretty good examples), it just doesn't scale. -MAP Thursday, April 27, 1995 11:24:06 PM GenWeb Item From: Annelise Anderson,ANDRSN@HOOVER.STANFORD.EDU,Internet Subject: To: GenWeb From: HOOVER::ANDRSN "Annelise Anderson" 27-APR-1995 23:17:54.07 To: IN%"ttw@beltway.att.com" CC: ANDRSN Subj: RE: Some comments about www and GenServ From: IN%"ttw@beltway.att.com" 27-APR-1995 09:44:39.12 To: IN%"ANDRSN@HOOVER.STANFORD.EDU" CC: Subj: RE: Some comments about www and GenServ Tom, I'm not sure we disagree....you write: : The problems :with GEDCOM have very little to do with its intention or inherent :capabilities, and very much to do with the poor stewardship the LDS/FHS has :given it, and the poor quality of support that genealogical systems give to :it. There is nothing inherent or antiquated or old technology in GEDCOM :itself that causes these problems. Yes! That's why GENWEB should take over, at least insofar as Internet matters are concerned. GEDCOM is the interface between users and the Internet/WEB. If it's technically okay--as far as I can tell it's just moving data from one data base (records, fields) to another--why not straighten out the mess and take over the stewardship?
If, for purposes of indexing data bases or the records on individuals that these data bases contain, there's a need for another kind of record, or another field in each record, then say that this would be desirable, what it's for, etc., with the implication that programs having this feature (some could implement it already using available fields) would be the best for Internet purposes. : It is not time to start imagining that anything we do will have a :realistic impact on any part of the market. I disagree, but it's a matter of judgment. I would predict that within a year there will be people offering for a fee to search for your ancestors on the entire World Wide Web, accessing the several sources of information indexed and unindexed. A well-formed GEDCOM as the price of entry to some services might make it less expensive, but it will be possible even without that. In any event, even if it is not yet realistic to have an impact on the market, it is realistic to define what features programs and GEDCOMs produced from them should have that they don't have now; if you don't know what you want from the market, you're left with what someone else offers. Incidentally, my idea of what would happen if person x in one data base were discovered (by one of the owners, probably, but not necessarily) to be probably the same person as y in another data base would be this: The discoverer would fill out an HTML form at the GENWEB site that gave information on the names (not necessarily the same) of the two people, data on which the identification is based, names/locations of the two data bases, and e-mail addresses of the owners, and send it. The receiving "program" would forward a message back to the owners at their e-mail addresses informing them of this identification and asking them to put a flag in the person's record, e.g., GEDWEB ID=xxxx. It would be different for each person. The GENWEB ID INDEX would then keep a record that person x ID=xxxx=person y ID=yyyy.
The data base owners would also enter the ID in a special field that outputs to GEDCOM in the programs they were using, with an appropriate tag. Thus, future users of the GENWEB ID INDEX or either of the data bases would discover these identities and could follow them up. No automatic program would change the data in anyone's data base. Not sure how this fits in with any other proposals; it just seems to me that these identities are very important information. Best regards, Annelise Friday, April 28, 1995 12:47:45 AM GenWeb Item From: D.Evans@BoM.GOV.AU,Internet Subject: Is there a DIGEST mode? To: GenWeb Greetings, All, from the Land Down Under. I have 'spoken' to LISTSERV, but cannot find a DIGEST option. If I'm missing something really obvious, would someone please tell me, otherwise I will have to UNSUB. Thank you. David ... the sap in the family tree Friday, April 28, 1995 1:38:51 AM GenWeb Item From: Anders Andersson,andersa@Mizar.DoCS.UU.SE,Internet Subject: re: Time for an index? To: GenWeb >>If someone is looking at my tree where I claim that John Smith is the son >>of Robert Smith and sees a link to your John Smith who is the son of James >>Smith, They'll want to know that we're both referring to the same person. >>Hopefully, along with the link from my database to yours, I'll have a note >>on why I disagree with your claims. Equally hopefully, you'll have a link >>back to mine with an explanation of why you disagree with my claims. >> >This is exactly the reason that we need a more stable referencing mechanism >than URLs which break every time a file gets moved. No. We *do* need a more stable referencing mechanism than URLs, but that will only work towards solving the problem of moving or inaccessible documents. 
There will still be disagreements over the identities or relationships of individuals (perhaps due to incomplete or unreliable sources), so some kind of "disputed link" scheme like the one described above will be necessary, whether by URLs, URNs or some other mechanism. I'd like to second Brian Randell's point about solving separable problems separately. -- Anders Andersson, Dept. of Computer Systems, Uppsala University Paper Mail: Box 325, S-751 05 UPPSALA, Sweden Phone: +46 18 183170 EMail: andersa@DoCS.UU.SE Friday, April 28, 1995 2:03:58 AM GenWeb Item From: Peter and Ann Floyd,sunshine@acslink.net.au,Internet Subject: Re: Some comments about www and GenServ To: GenWeb CMANIS said to us recently ... [I snipped a lot of this for brevity only] >What can we do about the producers of genealogical programs who >say they will import and export GEDCOM data ? Many of them do not >even write a datafile which can be read by any program, but >they say "Gedcom Compatible". > >... > >Good luck to all, Cliff Manis > > >-- We HAVE to make a stand. IF a particular program says it is GEDCOM compatible, but turns out it isn't, then we need to do a few things ... Consumer Affairs (or what the equivalent is in other countries) need to know (it is false advertising). Genealogists on the 'net need to know (tell us here, tell us in the soc.genealogy.* newsgroups etc). There is a GEDCOM standard document. If it isn't clear enough then we need to improve on it. There are enough people on the web who can draft up good quality documents - find some. If it is clear enough, and a program doesn't meet them ... then we should get them blacklisted until they do (take a look at the recent Intel Pentium fiasco - the 'net has some power out there). Cheers:-------------------------------Peter (& Ann) Floyd\ |Email:Sunshine@acslink.net.au Revelations 14:6-7.| \Genealogical Research: WHITBY anywhere,anytime.
Mail me!/ Friday, April 28, 1995 2:56:51 AM GenWeb Item From: Martin van Keulen,a.l.h.j.vankeulen@student.utwente.nl,Internet Subject: Re: GenWeb - some proposals To: GenWeb Flame me privately, but I feel the need to reply to this: >At 7:51 PM 4/27/95, Phlete Teachout wrote: > >> As an amateur genealogist whose technical capabilities are severely >> limited by old age, senility, and the utter lack of any formal computer >> training, what you propose is much, much more than I am willing to attempt. > >I think it's important to realize that *many* genealogists are like Phlete. >Genweb will certainly start as a system that only encompasses the computer >geeks amonst us, but the goal *has* to be such that we can include all >those who would rather dig through old documents than computer files. Many >genealogists are retired people who got into genealogy as they began to >feel increasingly mortal. They aren't (and shouldn't be) techowizards. > >Once the substrate is there for us geeks, there's probably a useful >sidemarket in getting data from people like Phlete into the database and >pulling data out of the database for them. This may even be a money making >market, although any particular customer probably won't be able to pay very >much. > > History tells us to "relax" about the development of (new) technologies. Technologies/ideas are not "winners" because there are criteria which make a technique/system "better" or "the best". People decide which they like best and stick to that decision. Because the younger generation has the longer breath, in the end mostly their choice prevails. Being a "young dog" myself, I've grown up with the Macintosh "touch and feel" ;-). The World Wide Web combines (part of) this concept with the computing power of Unix. The choice of my new generation lies with these, the so-called Graphic User Interfaces. 
Martin van Keulen phone: +31-53-89 50 46 fax: +31-53-89 40 97 e-mail: a.l.h.j.vankeulen@student.utwente.nl KIVI@student.utwente.nl Friday, April 28, 1995 4:04:13 AM GenWeb Item From: Anders Andersson,andersa@Mizar.DoCS.UU.SE,Internet Subject: Re: Searching, surname lists, etc. To: GenWeb Birger Wathne writes: >A surname list wouldn't be terribly useful in large parts of >europe. In Norway, it was common until 1900 to inherit your fathers >given name as surname. It's still the common thing in Iceland. >In addition, it was quite common to change names depending on where >the family lived, etc.... Quite true. A surname index is fine, but it's not the only index we want. A geographical location index would be an appropriate companion. Then you could have all sorts of specialized indices, but these two are to me the most obvious pieces of information to use when searching for associated material. Karen Isaacson has for a number of years maintained the `Roots Surname List' (listing researchers working on a particular surname), and in 1992 also started a corresponding `Roots Location List' (which she later turned over to Suzanne Badenhop ). I haven't followed how these lists have developed lately, and they are probably not sufficient for our needs, but perhaps we could check with them whether we may benefit from a joint effort. Both lists are available at for searching. -- Anders Andersson, Dept. of Computer Systems, Uppsala University Paper Mail: Box 325, S-751 05 UPPSALA, Sweden Phone: +46 18 183170 EMail: andersa@DoCS.UU.SE Friday, April 28, 1995 5:45:57 AM GenWeb Item From: Jim Higgins,HIGGINS@dnenet.nov.dne.bnl.gov,Internet Subject: Re: Living people's vital statistics...ethics? To: GenWeb From: Jon Grantham > Subject: Living people's vital statistics...ethics? > Reply-To: grantham@math.uga.edu > > Hi. 
I am just starting to put my family's genealogy on my Web page, > and I have a few questions of a different nature than the interesting > technical questions that have been filling the list of late. > > 1) Should I avoid putting the birth dates of living people on the Web > w/o their permission? Some companies, unfortunately, use a > birthdate as a "password" over the phone, and I don't want to make > life difficult for them. Some people also don't like being wished a > happy birthday. > > 2) Should I omit the birth year? It might be reasonable to say that > I don't have a right to announce to the world, say, that my Aunt Pam > is 46 (whoops, sorry Aunt Pam :-) if she didn't want that known. > > 3) Should I omit living people altogether without their permission? > Presumably anyone doing research could trace back to a deceased > individual and then contact me if they want information about living > relatives. On the other hand, I conceive of my web offering as more > robust than just a way to exchange vital stats on people. It's the > story of my family (or it will be), and I'd like to tell people for > example, that my cousin Rich served in the Marines or my Uncle Terry > is going back to school to become a teacher. > > Getting permission from close relatives (e.g., my parents). Getting > in touch with and explaing the Web to distant relatives becomes more > of a problem. > > My gut feeling is to cut off everything but birth year for relatives > as close as first cousins, and chop off any relatives beyond first or > second cousins. > > Comments? Please! > > Jon Grantham > grantham@math.uga.edu This is an interesting dilemma that I ran into with my mother-in-law. She was 5 years older than her husband and back when she was married there was some controversy (I am told) with her husband's mother about him marring an older woman. Well she "cheated" on her birthdate by 3 years, making herself younger. 
This was perpetuated in lots of documents, including her driver's license. Of course when I was doing the genealogical research on birth and baptismal records, the truth came out. She was aghast and made me promise to use the same date in my files (that I sent to all the cousins, etc.) that she had always been using. Naturally I felt uneasy but complied. Now that she has passed away (bless her soul) I am maintaining the correct date. In summary, this is not an easy question to answer and probably needs to be handled on a case basis. This issue is related to the whole question of not releasing the federal census records until some 50, 60, or 70 years after the census. Have a nice day! Jim Higgins Phone: 516-282-2432 Fax: 516-282-4900 Friday, April 28, 1995 6:35:21 AM GenWeb Item From: Birger A. Wathne,Birger.Wathne@vest.sdata.no,Internet Subject: Re: GenWeb - some proposals To: GenWeb >From: Phlete Teachout >Subject: Re: GenWeb - some proposals >To: "Birger A. Wathne" >Cc: genweb@ucsd.edu >Mime-Version: 1.0 > >Nothing personal, Birger - but you just turned me off. > Well, that's a risk I decided to take. After all, we have to discuss the technicalities as well as all the nice ideas. Ideas without implementations are quite worthless. I was fully aware that my message would get a bit too technical for many of you. I guess I could have smoothed the edges somewhat, but I was rather tired when writing it..... To explain some things a bit better: If all You want is to submit a GEDCOM file to a central repository, Send it to one of the established bases. If you want to make your GEDCOM files available to the web in HTML format you will have to convert them to HTML files, or import them to a database built somewhat like mine (with WWW front end). Somehow we need to establish some playing rules, and develop some easy to use software that enables people to publish their data on the GenWeb without really knowing all these gory details. 
What I envision is something like this: You have a GEDCOM file, and the possibility to make the data available either through WWW, gopher or ftp. If you have the possibility to put it up through a WWW (http) server, you could choose between a database format or flat files. For gopher or ftp, you have to use flat files. Based on this decision either download all you need to get a database running, or a program that converts your gedcom into a set of HTML files. Register Your database with GenWeb to get a unique database reference. Also tell the GenWeb admin what the path to your database looks like. Generate the database or the HTML files. That's all it would take. The registration of your path, and the usage of a database id would enable You to move the database around, and just notify the GenWeb admin. All links to the database should still be valid. That's the main difference from today. And I think it's important that a link to a person in someone else's database should be stable, even if the database moves to another server, etc. For the data provider, there would be no big differences from the methods available today. As we have only this one list, we have to discuss both technical issues and other (perhaps more important) issues like ethics, usability, etc here. This list started life as a very technical one. I'm glad for all the other input here as well. I hope we can all coexist.... Birger Friday, April 28, 1995 12:42:09 PM GenWeb Item From: Bill Minnick,svpafug@rahul.net,Internet Subject: Re: Copy of: re: Time for an index? To: GenWeb > Date: Thu, 27 Apr 1995 16:41:17 -0700 > To: genweb@ucsd.edu > From: Bill Minnick > > My immediate reaction is: If the concept of private data bases complicates > our ability to create a workable GENWEB solution, then shouldn't we > eliminate the concept of private data bases from Genweb? COMMENT FROM Michael A. Patton: > >But, the whole point of the GenWeb project is linking distributed >independent databases.
The single database case is already solved >(with at least two pretty good examples), it just doesn't scale. Michael, I fully agree with you that the GENWEB data bases must be distributed. Any data base that is set up to accept, save and display (on demand) public inputs will use the Web to add value to their data base. Anyone who does not facilitate receipt and distribution of public comment on individuals in a data base gains little by placing a data base on the Web. In fact I predict data bases on the Web which do not support dialog among all interested parties, will be bypassed by the "interactive" data bases. I did not intend to advocate development of a single data base. Cheers, Bill Minnick Friday, April 28, 1995 1:09:16 PM GenWeb Item From: T.T.Wetmore,ttw@beltway.att.com,Internet Subject: Re: Some comments about WWW and GenServe To: GenWeb >Tom, I'm not sure we disagree....you write: [...lots of gibberish by Tom deleted...] I'm not sure either; I enjoy being a devil's advocate and being contrary at times (I'm too old to be nice all the time; my smiling muscles get to aching too much!). >...GEDCOM is the interface between users and the Internet/WEB. If it's >technically okay--as far as I can tell it's just moving data from one data >base (records, fields) to another--why not straighten out the mess and >take over the stewardship? [Warning -- the next few paragraphs contain the standard Wetmore tirade against GEDCOM. Just wanted to point out ahead of time that I am not disagreeing with Annelise in any way (secretly I believe he is right, but I'd never say so in public -- whoops, please disregard that).] The major problem with GEDCOM in practice is that it worsens the problems that it was intended to solve. That problem is, of course, the transmittal of genealogical data from one database system to another. The typical genealogical database uses a simple internal database model for holding its data. 
GEDCOM import requires converting data from the import file to that internal format. But since the GEDCOM semantic standard is so ambiguous, and since the individual database developers have their own interpretations as to what's important, and what things mean, GEDCOM import is almost always a destructive operation. Ditto on GEDCOM export. If one system imports a GEDCOM file from another, and then immediately exports the same data, you can be sure that you will be unhappy with the results. There are expensive systems that cannot even import GEDCOM files that they THEMSELVES exported without changing or losing some of the data. Revoltin'. You know that game where you whisper a phrase in someone's ear, and then they whisper it in the next person's ear, and so on, until you get to the last person who says out loud the message they heard? Moving your data around in GEDCOM format is just like one of those whispers! You never know what's going to come out at the other end! This is why I call the GEDCOM medicine worse than the disease. GEDCOM in theory, however, is an excellent format for both transmitting and storing genealogical data (though this is an opinion that many others disagree with, but that's okay, since they're wrong). If databases stored their data in GEDCOM format then there would be no data destruction on data import or on data export. However, this does place a big burden upon the database systems to handle all the weird things that other systems and people do to the data in the GEDCOM, as the standard is still so ambiguous that many interpretations of value types and structures exist. I have angered a few persons by implying that they don't take my viewpoint on this matter seriously because they are not clever enough to solve this problem.
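The claim that keeping records in GEDCOM form avoids loss on import and export can be illustrated with a small Python sketch. It assumes only the basic GEDCOM line shape (level, optional @xref@, tag, optional value). The names `Line`, `parse_line` and `format_line` are hypothetical, and this is not how LifeLines or any other program is actually implemented. The point is that nothing is interpreted or dropped, so tags the program has never heard of survive the round trip.

```python
from typing import NamedTuple, Optional


class Line(NamedTuple):
    level: int
    xref: Optional[str]  # e.g. "@I1@" on a record's first line, else None
    tag: str
    value: str           # kept verbatim; never interpreted


def parse_line(text):
    """Split one GEDCOM line into its parts without judging the tag or value."""
    parts = text.split(" ", 2)
    level = int(parts[0])
    if parts[1].startswith("@"):
        # "0 @I1@ INDI": the second field is a cross-reference id.
        rest = parts[2].split(" ", 1)
        return Line(level, parts[1], rest[0], rest[1] if len(rest) > 1 else "")
    return Line(level, None, parts[1], parts[2] if len(parts) > 2 else "")


def format_line(line):
    """Write a Line back out in the same shape it was read."""
    fields = [str(line.level)]
    if line.xref is not None:
        fields.append(line.xref)
    fields.append(line.tag)
    if line.value:
        fields.append(line.value)
    return " ".join(fields)


sample = [
    "0 @I1@ INDI",
    "1 NAME John /Smith/",
    "1 _NICK Jack",       # nonstandard tag: stored, not discarded
    "1 FAMC @F1@",
    "0 TRLR",
]
records = [parse_line(text) for text in sample]
print([format_line(record) for record in records] == sample)
```

A program that instead maps each line onto a fixed set of internal fields has to decide what to do with `_NICK`, and that decision is where the data loss described above creeps in.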
Two reasons why LifeLines has become popular for WWW sites is that it stores its databases in GEDCOM format and therefore accepts input files from all comers without prejudice (and without data deletion or modification), and it is very easy to make LifeLines generate any kind of output from its databases, whether they be in HTML, Postscript, TeX, RTF or whatever. I am interested by your ideas that GENWEB take on the matter of establishing its own standards for genealogical data interchange. My natural contrariness sees first the difficulties of such an undertaking, dwelling on the GEDCOM failures, but the results, I admit, would be wonderful. >>It is not time to start imagining that anything we do will have a >>realistic impact on any part of the market. >I disagree, but it's a matter of judgment. I would predict that within >a year there will be people offering for a fee to search for your >ancestors on the entire World Wide Web, accessing the several sources of >information indexed and unindexed. A well-formed GEDCOM as the price of >entry to some services might make it less expensive, but it will be >possible even without that. You are probably right. I hadn't thought of that, Luddite that I am. Maybe we should get some royalty rules worked out and built into our database accesses. Come to think of it, it would be very easy to have a LifeLines based system output billing statements based on database access. Even the LifeLines author might be due, let's see, maybe a hundreadth of a cent royalty for each person accessed. Whooah, be still my beating heart! >In any event, even if it is not yet realistic to have an impact on the >market, it is realistic to define what features programs and GEDCOMs >produced from them should have that they don't have now; if you don't >know what you want from the market, you're left with what someone >else offers. Good points. 
But even if you knew what you wanted from the market, my point is that the market is so small and thread bare, that you probably won't find what you want. Unless of course you are willing to use UNIX. >Incidentally, my idea of what would happen if person x in one data base >were discovered (by one of the owners, probably, but not necessarily) >to be probably the same person as y in another data base would be this: >The discoverer would fill out an HTML form from at the GENWEB site that >gave information on the names (not necessarily the same) of the two >people, data on which the identification is based, names/locations of >the two data bases, and e-mail addresses of the owners, and send it. >The receiving "program" would forward a message back to the owners at >their e-mail addresses informing them of this identification and asking >them to put a flag in the person's record, e.g., GEDWEB ID=xxxx. It would >be different for each person. The GENWEB ID INDEX would them keep a >record that person x ID=xxxx=person y ID=yyyy. The data base owners >would also enter the ID in a special field that outputs to GEDCOM in >the programs they were using, with an appropriate tag. Thus, future >users of the GENWEB ID INDEX or either of the data bases would discover >these identities and could follow them up. No automatic program would >change the data in anyone's data base. Not sure how this fits in with >any other proposals; it just seems to me that these identities are very >important information. Interesting. Though this may not be the final "protocol" for solving this problem, it opens a great vision to what will be possible. Not to sound like I'm always pushing LifeLines ('cept that I usually am), but the structures to implement this protocol already exist within it. 
Since LL allows user defined records, special name link records are directly implementable, LL can link its records together in any way, and since LL output is fully programmable, any files generated by LifeLines, whether GEDCOM or HTML or whatever, could carry this name linked info in any desired format. Tom Wetmore, 4/28/95, ttw@beltway.att.com Friday, April 28, 1995 2:16:39 PM GenWeb Item From: Birger A. Wathne,Birger.Wathne@vest.sdata.no,Internet Subject: Re: GenWeb - some proposals To: GenWeb >From: "Michael A. Patton" > We should be able to use this info to create a future URN looking > something like URN:GenWeb-person:/// >I think you've got the syntax slightly wrong, but it occurred to me >that in the early tests we won't ever actually have a URN, they'll be >a later optimization... My theory on what to do is fairly similar to As I said, no time to read those rfc's. Not really time to write this note in my working hours either :^) The GEDCOM syntax, the URN syntax and the DNS syntax will have to be corrected by those who know. Anders Andersson pointed out to me that the TXT tags should probably use '=' instead of ':' to follow recommendations in rfc 1464. I also have to point out that discussions with Anders have shaped much of my thought around this subject. He seems to know all those rfc's that I never find the time to read. And our ideas seem to be quite close. I should have mentioned this in the original message, but I have already explained that I was tired..... As to the inclusion or exclusion of variables in the URL's; Excluding variable substitution would mean that the person reference ID will always have to be last in the URL. Do we want this restriction? Would we want to go the other way, and enable more variables? In my proposal I kept variable substitution simple. We may want more elaborate methods (Speak up, Anders ;).
As this should be hidden to users, I don't see any reason for doing it simple if there is something to be gained by a variable substitution mechanism. Hmmm back to work before my boss finds out. Birger Friday, April 28, 1995 5:40:12 PM GenWeb Item From: Mickey Lane,MLANE@CSI.compuserve.com,Internet Subject: stuff To: GenWeb Martin van Keulen sez: >History tells us to "relax" about the development of (new) technologies. >Technologies/ideas are not "winners" because there are criteria which make a >technique/system "better" or "the best". People decide which they like best >and stick to that decision. > >Because the younger generation has the longer breath, in the end mostly >their choice prevails. > Does this mean we can blame MS-DOS on the young squirts? Mickey. Friday, April 28, 1995 6:06:26 PM GenWeb Item From: Anders Andersson,andersa@Mizar.DoCS.UU.SE,Internet Subject: Re: GenWeb - some proposals To: GenWeb Mike Patton writes: >simple transform (I expect just adding a prefix). In the meantime >they could be expanded as you describe, except I would remove some of >the hair you have because I don't think it's needed...specifically the >"variable substitution". I discussed this privately with Birger earlier, and my own idea behind such "hair" (maybe Birger has other ideas) was to provide greater flexibility with respect to the URL template, allowing the index key to appear anywhere in the URL (considering that some databases currently in operation don't put the key last). Another use might be user-defined presentation format, such as obtaining the HTML page in a preferred language (when available). However, perhaps the added functionality isn't worth the additional complexity of the specification and the application software? I haven't really decided on this myself. A very reasonable approach would be to stick to a simple and clean solution to begin with, and decide later whether we want a second version with more features. 
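The trade-off between plain appending and "variable substitution" can be made concrete with a short Python sketch. The placeholder syntax (`{id}`, `{lang}`), the function name `expand`, and the www.example.org URLs are all hypothetical. The substitution syntax from the original proposal is not reproduced in this thread, so this shows only the general idea: a template with no placeholder behaves like a Base-URL prefix, while a template with placeholders lets the key appear anywhere and allows extra options such as a preferred language.

```python
from urllib.parse import quote


def expand(template, record_id, **options):
    """Expand a URL template for one record.

    With no {id} placeholder the record id is appended (the simple scheme);
    otherwise {id} and any named options are substituted in place.
    """
    if "{id}" not in template:
        return template + quote(record_id, safe="")
    values = {"id": record_id, **options}
    url = template
    for name, value in values.items():
        url = url.replace("{" + name + "}", quote(value, safe=""))
    return url


# Simple scheme: key always last.
print(expand("http://MAP.server.genweb.org/map/geneal/", "I1034"))
# Template scheme: key in the middle of a query string.
print(expand("http://www.example.org/cgi-bin/lookup?person={id}&format=html", "I1034"))
# Template scheme with a presentation option.
print(expand("http://www.example.org/{lang}/person/{id}.html", "I1034", lang="sv"))
```

Because the append case falls out as a template with no placeholder, a first version could ship with appending only and add placeholders later without invalidating existing records.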
>One reason I want to put the full URLs in here is to allow database >naming and host naming to be independent, allowing among other things, >multiple databases on the same server. > >Here's an example (fictional) written as part of genweb.org zone, >using the "might be" names from above: > MAP.server CNAME www.map.cambridge.ma.us. > MP01.gedcom TXT "Contact:Mike Patton" > TXT "Base-URL:http://MAP.server.genweb.org/map/geneal/" Now that we have the URL in a TXT record which (hopefully) can be updated at will, do we really need to maintain a CNAME record for each server as well? A move from one server to another will pretty likely imply changing also the local path, meaning that both records will probably have to be updated at each move. Why not simply put "http://www.map.cambridge.ma.us/map/geneal/" in the TXT record? Another (perhaps insignificant) concern is that the DNS RFCs seem to recommend against having CNAME records point to non-canonical names. Now, "www.map.cambridge.ma.us" happens to refer directly to an IP number, but many HTTP servers are advertised as WWW.dom.ain which in turn has just a CNAME record pointing to some physical host. Second, I'd suggest allowing RFC 1464 to specify the syntax for our TXT records, meaning that an equals sign (=) should be used instead of a colon (:) to separate attribute name and value. > A DNS based system would fix any replication/stability problem. > >Well, not while there are only two servers and they're both at UCSD, but >there's a well known solution to that... My tiny mailing list is still functional, though pretty unused. Perhaps we can discuss the more intricate DNS operational issues there, since they have little to do with the GenWeb application itself? I should probably update my WWW pages, by the way. -- Anders Andersson, Dept. 
of Computer Systems, Uppsala University
Paper Mail: Box 325, S-751 05 UPPSALA, Sweden
Phone: +46 18 183170 EMail: andersa@DoCS.UU.SE

Friday, April 28, 1995 8:19:45 PM
GenWeb Item
From: Cliff Manis,cmanis@progcons.com,Internet
Subject: re: Some comments about WWW and GenServ
To: GenWeb

Readers:

Ref part of this message on Fri Apr 28 19:16:39 1995, Tom Wetmore said:

> From: ttw@beltway.att.com (T.T.Wetmore)
> To: gedcom-l@vm1.nodak.edu, lines-l@vm1.nodak.edu, genweb@UCSD.EDU
> Subject: Re: Some comments about WWW and GenServe
>
> Maybe we should get some royalty rules worked out and built into our
> database accesses. Come to think of it, it would be very easy to have a
> LifeLines based system output billing statements based on database access.
> Even the LifeLines author might be due, let's see, maybe a hundredth of a
> cent royalty for each person accessed. Whooah, be still my beating heart!

Tom, are you sitting down? If not please do. Quick, take a pill! Now read this.

During the last two weeks, the GenServ system alone has produced over 31 megs of output to its authorized users. Wow, why not charge $.001 per byte? Even at a tenth of a cent per byte that is over $31,000 US. I need to know where to send the check before I spend all the money ! ! !

Since the GenServ is a "No Cost" system, those who are interested in it may learn more about it by seeing my .sig and requesting the 'genserv-info' message. Please do not reply to me for info about the GenServ system.

I'm sure that some of the www (Web) systems will be charging for Genealogical Database Information soon if they are not now. IMHO, many of them will not have the reports available that are already available now - FOR FREE - on the GenServ.

In several Email posts lately, I have seen much about preparing all the different data formats, ways to stabilize and format data for new systems.
Probably, 98% of all the data on the GenServ system was sent to me by those who did not and do not understand all the intricacies of the GEDCOM data format. For many of them the datafile they sent was the first GEDCOM file they ever made. The GenServ system was designed as a project to let its users share their own data and see the 'hard-earned' data which was being collected by others. This data is from: 1. those who go visit the library on a regular basis. 2. those who just want to add one more name to a database. 3. those willing to share their data in hopes of finding another name. 4. those who are still able to chat with Grandma and get information. 5. those who want to retain all the rights to the data they have found. 6. those who are just willing to share. The GenServ system is available for anyone willing to share their data and send it to me in an acceptable/readable GEDCOM datafile. Many of these people do not know about www. Many of them have never used a modem and may not during their lifetime. But - they have good data - and it's on the GenServ system now Real soon the GenServ system will have over 1,000,000 names in GEDCOM data and on-line. ^^^^^^^^^^^^^^^^^^^^ Below are just part of the GenServ system reports which are produced each week. These reports and several others showing who-is and what-has been requested, have been available for me to see the growth for a longtime. Here are the GenServ reports for the last two weeks. 
GenServ Activity Report for 21 Apr 95

Report type requested             bytes    msgs
============================  =========  ======
search:                         9411144     677
soundex:                        2907446      84
send:                            795013      79
report:                          184708     154
match:                            18193      14
searchcount:                       2742      14

GenServ Activity Report for 28 Apr 95

Report type requested             bytes    msgs
============================  =========  ======
search:                        13193560     905
soundex:                        3450920     106
send:                           1121933     109
report:                          310309     290
match:                            18652      15
searchcount:                       3134      16

Good Luck to all,
Cliff Manis

--
Cliff Manis cmanis@progcons.com Seoul, Korea
GenServ "Genealogical Server" a service for making GEDCOM data available.
For GenServ info, just send a message to: genserv-info@progcons.com
WWW Genserv URL: http://www.cs.ncl.ac.uk/genuki/GenServ/

Saturday, April 29, 1995 1:07:48 AM
GenWeb Item
From: Gary Hoffman,ghoffman@ucsd.edu,Internet
Subject: Re: Is there a DIGEST mode?
To: GenWeb

Sorry, we don't have a digest mode, but I am compiling an archive at http://demo.genweb.org/genweblist/genweblist.html.

BTW, anyone who wants to unsub this list should send their message to listserv@ucsd.edu. Just put in the body of the message: unsub genweb. I try to pick up the unsub's sent to the list, but it's a manual operation.

Cheers, Gary
***************************************************************************
*Gary B. Hoffman, Computer/Language Lab Director e-mail: ghoffman@ucsd.edu*
*Graduate School of International Relations and Pacific Studies (IR/PS)*
*University of California, San Diego (UCSD) voice: (619) 534-7733*
*9500 Gilman Dr., La Jolla, CA 92093-0519 USA fax: (619) 534-5727*
***************************************************************************

Saturday, April 29, 1995 2:59:55 AM
GenWeb Item
From: Birger A. Wathne,Birger.Wathne@vest.sdata.no,Internet
Subject: Re: GenWeb - some proposals
To: GenWeb

>Now that we have the URL in a TXT record which (hopefully) can be
>updated at will, do we really need to maintain a CNAME record for
>each server as well?
A move from one server to another will pretty >likely imply changing also the local path, meaning that both records >will probably have to be updated at each move. Why not simply put >"http://www.map.cambridge.ma.us/map/geneal/" in the TXT record? How would people keep a stable reference to your base (except through GenWeb software at each client site)? We need to have the URL point to a CNAME record to enable you to add a person to your Web browser hotlist, etc, and be shure it stays stable. We could require that the host given in the CNAME record should be the real, canonical name. But it would be better if a CNAME pointing to another CNAME is ok. Let's move this discussion to genweb-dns-op@kay.docs.uu.se. Birger Saturday, April 29, 1995 8:20:01 AM GenWeb Item From: Mickey Lane,MLANE@csi.compuserve.com,Internet Subject: Re: GenWeb - some proposals To: GenWeb Anders Andersson writes, >Second, I'd suggest allowing RFC 1464 to specify the syntax for our >TXT records, meaning that an equals sign (=) should be used instead >of a colon (:) to separate attribute name and value. Could you expand on this? The ROOTSBOOK web server currently uses "http://..../GenWeb.exe?BBBB:NNN" where BBBB is an alphanumeric {database|book} name and NNN is an entry number. Would you suggest something different? Now would be a real good time to do so... :-) Mickey. Saturday, April 29, 1995 8:32:28 AM GenWeb Item From: Mickey Lane,MLANE@csi.compuserve.com,Internet Subject: ROOTSBOOK To: GenWeb Greetings, With all the discussion in recent days about indexes and linked databases and so forth, I think it would be appropriate to spell out what I'd like to see happen to the ROOTSBOOK software package. It already does a lot of what's being discussed. For those not familiar with my babbles over the years - ROOTSBOOK is a software package I wrote that does linked databases. This note is not about linked web pages or indexes of such, it's about linked data files. 
Here's what I'd like to see: Phase I - Complete the work under way to make the web server functional and turn it loose on the general public. This is almost complete - I'm waiting for a compiler to come in so I can run it all in native mode on an Alpha NT server. It's currently running on my workstation which is OK I guess but it gets in the way of the stuff I'm getting paid to do... which isn't genealogy :-( Phase II - Upload the ROOTBOOK sources to genealogy.emcee.com. The owner of that system has kindly provided space for this. Most of the sources have already had the GNU general license added. Actually, I could do this at any time. Phase III - Clean up the ROOTSBOOK code. There's a world of difference between code you experiment with for your own amusement and stuff you release for public consumption. I think the major structure of the code is good but there's a number of functions that should be looked at. I'd like to see it run on a Unix box. It should. (Actually, the code's not that bad - I just have to get my excuses in first.... It's already fully commented.) At this point, we would be able to have several identical but disconnected sites serving the various databases. Prior to advancing the state of the art, we'd need to conduct: Phase IV - Define a standard for database names and some sort of registration for them. I think it's reasonable to state that you can't have two "public" databases with the same name. It would probably suffice for the next year or so to do this manually using two or three interested people and mail/list messages. I've got some thoughts on how this should be done as I'm sure others do. Phase V - Develop a series of servers to provide the current locations (plural!) of the databases defined as a result of Phase IV. Note that providing the list of accepted names, the locations of them and the contents of the indicated databases are three different things. 
There are probably a lot of tools already functioning that will do the server functions but we'd need to define how to handle the actual data.

Up to this point, I haven't said anything that doesn't apply to any of the available software. What follows is ROOTSBOOK specific. A couple of background notes:

* ROOTSBOOK doesn't use GEDCOM. A tool exists to convert the GEDCOM format into what it does use. No tool exists to go the other way. This latter bit is not hard to do, it just hasn't been done.

* ROOTSBOOK uses a whole bunch of small linked databases instead of one big one. The processing software is based on this structure.

* The main chunk of the ROOTSBOOK software, the bit that converts the database info to presentation format, is client/server oriented. There are two major client/server relationships: the location of a named database (aka book) and the information available for any given book/entry. The people information only deals with book names - it has no clue where the other databases actually are.

Phase VI - Develop a series of servers to implement the 2nd client/server relationship that ROOTSBOOK requires. This is also not hard to do. What is hard is getting everyone to agree on how it's to be done. :-)

Once the two client/server tools exist (book location & entry information), the existing ROOTSBOOK sites could turn on these tools and we would have publicly linked databases! Sort of. Well, not really. What we'd need to do then is:

Phase VII - Modify the sea of existing databases to accommodate links. There's a lot of duplicates on the current ROOTSBOOK web server and it's only hosting 100,000 people. Someone recently mentioned the word 'spider' in conjunction with this subject and I had to chuckle. 'Spider' is the name of the tool that I plan to write to identify candidates in separate databases for linking.

This brings us to what I think may be the stickiest part of the whole project.
As it stands today, I take an existing GEDCOM file and convert it to a ROOTSBOOK file and run the software on it. I don't read either the .GED file or the resultant .GDB file. If it works, great. If it doesn't, and about 30% don't, I put the whole thing in a "need-to-work" directory and save it for later. What this all means is the ROOTSBOOK stuff is all traceable back to the program that was used to type the names in. Changes to the information can be propagated to the ROOTSBOOK output with no additional work. When the 'spider' runs, we lose that traceability and two databases now exist.

This problem is not specific to ROOTSBOOK. No matter what system eventually evolves, people's existing work will have to be modified if we're ever going to have linked databases. (As a side note, none of this means that 'someone' is going to be telling you what to do with your private material. It means that if you want to participate in the global database, there's some rules you have to follow to make it all work. Naturally, the easier the rules are to follow, the more people will participate.)

I think there's two ways to solve this problem. The first is to have a committee of people who accept GEDCOM data from various sources, run assorted tools on it like 'spider' and assume ownership(*) of the result. Changes to the accumulated material would be made by the committee at the request of anyone who could provide updates. (I use 'committee' for lack of a better term. If some means of commenting on things via mail messages were to evolve, the tool receiving the mail messages and updating the database entries would *be* the committee.)

The second method would be to have some stupendous set of guidelines that folks would have to follow in order to publish data. Given my experiences with the liberties currently taken with GEDCOM, this won't work. If the set of guidelines were developed and say twenty people were able to understand them :-), those twenty would be the committee.
I'm not suggesting that anything one normally associates with committees like voting, etc. take place.

I don't see ROOTSBOOK's special database format as a drawback to implementing any of this. One, GEDCOM doesn't support links. Some proposals exist to add this but none of the current packages (that I know of) make use of it. Two, going to something clearly different from GEDCOM will enable us to add the functionality we want in the manner best suited for the job. Three, using something completely different may aid in keeping things organized.

Phase VIII - Devise a policy for distributing the public databases around the different servers. Put these databases under the same protection that any commercial enterprise might use: scheduled backups, recovery procedures and sufficient redundancy to accommodate network hiccups and so on. Get to the point where we have one global database. (A lot of notes recently have pointed out that different people have different opinions about who begat whom. One global database does not imply that these differences of opinion have to be resolved, it implies that all of the information declared to be public lives within one storage scheme. If we need six entries for the same person outlining six different presumed relationships, so be it.)

Phase IX - Implement interactive updates to the global database. It'd be a shame if someone were to enter data into something that wasn't permanent or was the 'wrong one.'

As always, comments gladly accepted.

Mickey.

(*) "Assume ownership" can have several meanings. If you do the library work, the knowledge and credit is forever yours. You own it. On the other hand, you can't have two people editing the same file. Talk to any software engineer about it. Ownership in this sense refers to the person who has responsibility for making changes to the file. You, as the creator of the knowledge, may direct the person who currently owns the file on how to change it -or- you may do it yourself.
If you do it yourself, you're temporarily taking ownership of the file and it stops being a public file and becomes a private one. When you finish, you re-submit it and it becomes a public file again or more accurately, becomes *another* public file.

Saturday, April 29, 1995 11:00:46 AM
GenWeb Item
From: Anders Andersson,andersa@Mizar.DoCS.UU.SE,Internet
Subject: Re: GenWeb - some proposals
To: GenWeb

Birger writes in response to me:
>>will probably have to be updated at each move. Why not simply put
>>"http://www.map.cambridge.ma.us/map/geneal/" in the TXT record?
>
>How would people keep a stable reference to your base (except through
>GenWeb software at each client site)?

The TXT records are intended to be used by GenWeb software to create hypertext links as needed. While CNAME records only provide us with soft server names, requiring us to stick to a standard URL template, TXT records enable us to register arbitrary URLs.

>We need to have the URL point to a CNAME record to enable you to
>add a person to your Web browser hotlist, etc, and be sure it stays stable.

If Mike's database at is moved to , a stable server alias name won't help a lot. You can redefine MAP.SERVER.GENWEB.ORG to have a CNAME of HTTP.ACME.COM, but any saved URL with the path would still be invalid unless ACME.COM agrees to support this URL.

If we are to support permanent URLs, then we'll have to standardize on the access scheme and local part of the URL anyway, in which case a TXT record won't add any functionality over a CNAME record (the initial idea which formed part of the basis for establishing the GENWEB.ORG domain in the first place). I think that's a different problem, and it isn't even specific to GenWeb.

>Let's move this discussion to genweb-dns-op@kay.docs.uu.se.

No, please don't. That list is intended for a pretty small group of people concerned with GENWEB.ORG DNS operations, not with software development.
I understand that many readers of the GenWeb mailing list are not really concerned with the technicalities of software development either, and maybe we should form a separate group for that, but the genweb-dns-op list is even more limited in its scope. See if you want to know its purpose.

Maybe you are looking for a genweb-dns list? There isn't one.

--
Anders Andersson, Dept. of Computer Systems, Uppsala University
Paper Mail: Box 325, S-751 05 UPPSALA, Sweden
Phone: +46 18 183170 EMail: andersa@DoCS.UU.SE

Saturday, April 29, 1995 11:53:03 AM
GenWeb Item
From: Anders Andersson,andersa@Mizar.DoCS.UU.SE,Internet
Subject: RFC 1464 (was: GenWeb - some proposals)
To: GenWeb

Mickey Lane writes in response to me:
>>Second, I'd suggest allowing RFC 1464 to specify the syntax for our
>>TXT records, meaning that an equals sign (=) should be used instead
>>of a colon (:) to separate attribute name and value.
>
>Could you expand on this?

RFC 1464 is a formal proposal to define a standard syntax for use of DNS TXT records to store arbitrary data. It's not specific to WWW URLs or anything. It just allows you to associate a string value with an attribute name of your choice, such as

    MLANE   IN TXT "current project=ROOTSBOOK"
            IN TXT "favorite drink=orange juice"

or whatever you like. RFC 1464 says nothing about what attributes might be defined; that's up to the application (such as GenWeb).

This RFC was written by Rich Rosenbaum at DEC, and the author suggests implementing standard library routines to retrieve such attribute values. The point of following RFC 1464 is that we can use the same library routines, once they exist.

>The ROOTSBOOK web server currently uses "http://..../GenWeb.exe?BBBB:NNN"
>where BBBB is an alphanumeric {database|book} name and NNN is an entry
>number.
>
>Would you suggest something different? Now would be a real good time to
>do so... :-)

I have no specific suggestion for you.
That particular kind of URL would fit in a DNS TXT record (as the value of some attribute) without problem. RFC 1464 employs a quoting mechanism to allow any printable ASCII character anywhere in the attribute names and values, including space ( ), equals sign (=), and double quote (").

We haven't settled the issue of parameter substitution yet, which may affect your choice of BBBB:NNN as database and entry identifiers. If we implement sufficient flexibility in the TXT record URL templates, you could go on using this syntax or any other syntax you like. On the other hand, we may stick to a simpler scheme, requiring you (and others) to follow a given syntax. Please tell us what *you* want.

--
Anders Andersson, Dept. of Computer Systems, Uppsala University
Paper Mail: Box 325, S-751 05 UPPSALA, Sweden
Phone: +46 18 183170 EMail: andersa@DoCS.UU.SE

Saturday, April 29, 1995 4:25:12 PM
GenWeb Item
From: Martin van Keulen,a.l.h.j.vankeulen@student.utwente.nl,Internet
Subject: Re: Living people's vital statistics...ethics?
To: GenWeb

In the Netherlands, there exists a law (Wet Persoonsregistratie) that applies to all data on humans. In short, this law says that you may not disclose information in one's possession (to third parties) without prior consent of the person(s) involved. As a consequence, information search from 1920 till now through official channels is very limited.

An example of this law: I myself am registered as a student at the University of Twente. The university decided about a year ago that all data available (address, telephone number, study) on the persons registered (students, staff, etc.) would become available through a gopher service. Thereupon I received a letter which stated the purpose of the university, notifying me what was about to happen. If I objected to this use of this personal information, I could have prevented the use.
In my opinion, this is also the correct procedure for genealogical information: First, you ask a persons permission to add them to your database. Second, if you are "publishing" this database, you notify the living of your intentions If they object, you comply to their wishes. This way, you are doing things to the best of your ability, with an open mind to the people affected by your actions. > From: Jon Grantham >> Subject: Living people's vital statistics...ethics? >> Reply-To: grantham@math.uga.edu >> >> Hi. I am just starting to put my family's genealogy on my Web page, >> and I have a few questions of a different nature than the interesting >> technical questions that have been filling the list of late. >> >> 1) Should I avoid putting the birth dates of living people on the Web >> w/o their permission? Some companies, unfortunately, use a >> birthdate as a "password" over the phone, and I don't want to make >> life difficult for them. Some people also don't like being wished a >> happy birthday. >> >> 2) Should I omit the birth year? It might be reasonable to say that >> I don't have a right to announce to the world, say, that my Aunt Pam >> is 46 (whoops, sorry Aunt Pam :-) if she didn't want that known. >> >> 3) Should I omit living people altogether without their permission? >> Presumably anyone doing research could trace back to a deceased >> individual and then contact me if they want information about living >> relatives. On the other hand, I conceive of my web offering as more >> robust than just a way to exchange vital stats on people. It's the >> story of my family (or it will be), and I'd like to tell people for >> example, that my cousin Rich served in the Marines or my Uncle Terry >> is going back to school to become a teacher. >> >> >> Jon Grantham >> grantham@math.uga.edu > >This is an interesting dilemma that I ran into with my mother-in-law. 
>She was 5 years older than her husband and back when she was married >there was some controversy (I am told) with her husband's mother >about him marring an older woman. Well she "cheated" on her >birthdate by 3 years, making herself younger. This was perpetuated >in lots of documents, including her driver's license. > >Of course when I was doing the genealogical research on birth and >baptismal records, the truth came out. She was aghast and made me >promise to use the same date in my files (that I sent to all the >cousins, etc.) that she had always been using. Naturally I felt >uneasy but complied. Now that she has passed away (bless her soul) I >am maintaining the correct date. > >In summary, this is not an easy question to answer and probably needs >to be handled on a case basis. This issue is related to the whole >question of not releasing the federal census records until some 50, >60, or 70 years after the census. >Have a nice day! Jim Higgins >Phone: 516-282-2432 >Fax: 516-282-4900 > Martin van Keulen phone: +31-53-89 50 46 fax: +31-53-89 40 97 e-mail: a.l.h.j.vankeulen@student.utwente.nl KIVI@student.utwente.nl Saturday, April 29, 1995 5:26:42 PM GenWeb Item From: Martin van Keulen,a.l.h.j.vankeulen@student.utwente.nl,Internet Subject: Re: GenWeb - some proposals & stuff To: GenWeb To: cwg@DeepEddy.Com (Chris Garrigues) From: a.l.h.j.vankeulen@student.utwente.nl (Martin van Keulen) Subject: Re: GenWeb - some proposals & stuff >At 4:47 AM 4/28/95, Martin van Keulen wrote: > >> Flame me privately, but I feel the need to reply to this: > >I'm not sure I actually disagree with anything you said. >> Being a "young dog" myself, I've grown up with the Macintosh "touch and >> feel" ;-). The World Wide Web combines (part of) this concept with the >> computing power of Unix. The choice of my new generation lies with these, >> the so-called Graphic User Interfaces. 
>
>In the context I was writing, on the other hand, "older" meant people like
>Phlete who were weaned before computers even existed and only recently
>started using them. In genealogy forums, many of them are retired people
>who bought a computer to support that particular hobby, and are interested
>in little more than using the computer to support the hobby. These folks
>are almost classic examples of who Apple was talking about when they dubbed
>the Mac as "the computer for the rest of us". "Younger" means people like
>you or I who started using computers in their teens or even before, and
>can't imagine living without them.
>
>What I was attempting to say is that in developing for the genealogy
>market, you can't afford to ignore the large fraction of that market who
>are not computer wizards. This does not mean that eventually those folks
>won't all die off, but it does mean that it would be nice to make it as
>easy as possible for those folks to get their data into the system so that
>it's easy for our generation to use their musty old genealogy notes.

We seem to mean about the same thing, but phrase it differently.

The Web puts a graphic shell around the information placed on computers. In my vision, my genealogical information is accessed through a tree, possibly with iconized photographs. Adding, viewing and editing information is possible through the graphical shell. And within our reach this will mean hard work, but it's not impossible.

For the older generation of (amateur) genealogists, this may sound awkward. I have roamed the genealogical software and not found anything entirely satisfactory. The recent discussions even show that GEDCOM isn't the standard I believed it to be. Being accustomed to such a (primitive) environment, the older generation will most likely stick to their own system, be it written on paper or on computers.

Software (technology) nowadays has its emphasis on the user-interface.
I always translate this into: a child must be able to operate it. To avoid developing an empty shell we must use the information available today, but we must not be blinded by the methods used to arrange this information.

First of all, a standard format has to be defined (based on GEDCOM?) with which the work can begin. After that, as people grow accustomed to the new way, people will begin to translate to this standard. As I perceive things now, this is where you see your market :-), due to the variety of programs now in use.

Unfortunately, I haven't dug into the genealogical problems yet, so I rely on what is said in discussions on this list.

Martin van Keulen
phone: +31-53-89 50 46
fax: +31-53-89 40 97
e-mail: a.l.h.j.vankeulen@student.utwente.nl KIVI@student.utwente.nl

Sunday, April 30, 1995 7:22:50 AM
GenWeb Item
From: Mickey Lane,MLANE@csi.compuserve.com,Internet
Subject: Namespace proposal
To: GenWeb

A proposal for naming linked databases.

A name is composed of four elements.

* The common name
* The version
* The serial ID
* The file extension

F'instance: LANE01$IR$1.GDB

The $ separates the elements. LANE01 is the common name. It's what appears in the user displays as a database or book name. As time goes by, it may refer to different versions of the same database but it always refers to the same general group of people.

The IR is the version element. IR = Initial Release. U1 = Update 1. etc.

The last element in the name, 1 in the example, is a serial number given to each name accepted into the namespace. No two are the same.

The extension defines the file type. GDB is a ROOTSBOOK database, GED is a GEDCOM database and so on.

The serial number has an important role even though it uniquely identifies the database just as the name/version elements do. I'll explain in ROOTSBOOK terms since that's what I'm familiar with and as far as I know, is the only working example of this concept.

One of the subsystems within ROOTSBOOK is the books service.
This consists of a series of routines that manage the books. The API consists of calls to load known books, find a book by number, find one by name, add one, reorder the list, etc. Books are stored within the running program as a linked list of structs, each struct containing a variety of information.

Currently, when ROOTSBOOK fires up, it loads all known books from a file and numbers them in the order it reads them. The actual number values are unimportant. As the software runs, references are made between people by a book_number/entry_number pair.

When the idea of multiple name servers and database servers pops up, the numbers can no longer be arbitrary. Book 7 on one server is not the same as book 7 on another. The third element in the name resolves this problem. (*)

Once a file is assigned a name, the file never changes. When changes are required, a new file is published with the same common name, a new version element and a new serial number.

Links within the body of a database are made using only the common name. Perhaps a special record in the db header or a local table contains the reference to the specific version of the common name in use.

Mickey.

(*) Thinking back to my message yesterday where I described two servers, one for name and one for entries, this unique number system could do away with the name server, at least from the application programmer's point of view. The client could ask for entry 807/21 and the entry server would go find the host of database 807 if it didn't have it.

Speaking of my message yesterday, why was I met with a half-dozen unsubscribe messages this AM? :-) Do I talk too much?

Sunday, April 30, 1995 7:29:46 AM
GenWeb Item
From: Mickey Lane,MLANE@csi.compuserve.com,Internet
Subject: re: Namespace proposal
To: GenWeb

An addition to my Namespace proposal:

>The IR is the version element. IR = Initial Release.
>U1 = Update 1. etc.

Use R to reserve a common name so work could be done prior to release.
This would mean that a common name and a serial number could be assigned prior to the existence of the data. Once the data was available, the version element of the name would change to IR.

When a file becomes obsolete, change the name by prefixing the version with O. "LANE01$IR$nnn.GED" would become "LANE01$OIR$nnn.GED" when "LANE01$U1$mmm.GED" became available.

Mickey.
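
The naming scheme proposed above (common name, version, serial ID and file extension, with $ as the separator) can be sketched in a few lines of Python. This is only an illustration of the proposal as described, not ROOTSBOOK code: the `BookName` class and the function names are hypothetical, the serial is kept as a string because the examples use placeholders such as nnn, and treating any version that starts with O as obsolete is an assumption drawn from the "OIR" example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BookName:
    common: str      # e.g. "LANE01", shown to users as the book name
    version: str     # "R" (reserved), "IR", "U1", ...; "O" prefix = obsolete
    serial: str      # unique ID assigned when the name is accepted
    extension: str   # "GDB" for ROOTSBOOK, "GED" for GEDCOM, ...

    @property
    def obsolete(self) -> bool:
        # Assumption: only obsolete versions begin with "O" (as in "OIR").
        return self.version.startswith("O")

    def __str__(self) -> str:
        return f"{self.common}${self.version}${self.serial}.{self.extension}"


def parse_book_name(filename: str) -> BookName:
    """Split a name such as 'LANE01$IR$1.GDB' into its four elements."""
    stem, dot, extension = filename.rpartition(".")
    parts = stem.split("$")
    if not dot or not extension or len(parts) != 3 or not all(parts):
        raise ValueError(f"not a valid database name: {filename!r}")
    common, version, serial = parts
    return BookName(common, version, serial, extension)


def mark_obsolete(name: BookName) -> BookName:
    """Prefix the version with 'O'; names already obsolete are unchanged."""
    if name.obsolete:
        return name
    return BookName(name.common, "O" + name.version, name.serial, name.extension)


if __name__ == "__main__":
    book = parse_book_name("LANE01$IR$1.GDB")
    print(book)
    print(mark_obsolete(book))
```

Because the serial number travels with the name, a server could key its book list on it directly instead of numbering books in load order.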