* How to automate 1901 Census data collection

Post by philwarnorpkent » 19 Nov 2002 15:59

The 1901 Census and ways to improve your results

Many thousands of words have passed through the county lists run by RootsWeb about accessing and using the 1901 Census online, courtesy (right word?) of the Public Record Office (PRO) at Kew and Qinetiq (which rejoices in many variants like Kinky and Qineptiq and Qukup). Many are critical of the service provided; many are just grateful that for the last six weeks or so (I am writing this in mid October 2002) one has been able to access the data, after that major crash in January 2002! Words like fiasco are probably correct, but I do not want to place any blame or cast any aspersions here!

What I am trying to do here are several things:

1) Suggest how to use it best

2) Indicate what problems one might find accessing the data

3) Indicate a few of the many errors in the data and the indexes

4) Be generally helpful to my fellow family historians

OK, first the basics. Access is simple, given a computer and a link to your ISP. Just go to the URL http://www.census.pro.gov.uk/ and enter data into any of the search formats provided.

I suggest that you just play with searches for a while to get a feel for the site. If you do a blanket search using just the surname field, the first thing you will discover is that, unlike the 1881 CD set or the BVRI (version one or two) set of CDs, there is a limit to the number of people you are allowed to bring up on any one search. That limit is 300 souls. I suggest that for *ANY* search you set the number of persons displayed per page to 30.

Like the 1881 Census transcribed by dedicated members of Family History Societies, the 1901 has errors in both the data and the indexing of that data. My feeling is that the errors are more numerous than in the 1881, but I cannot prove that. Rude words as surnames abound, but as this will be read by the young, my lips are sealed. Similarly, glaring errors in the spelling of county, town and village names abound. Places are removed into distant counties too. So, my advice to you, if you cannot find a person by an exact spelling of his or her name, is to use wild cards. Now, unfortunately, to save search time I suspect, the consortium has decided to restrict the use of wild cards to the third or subsequent places in a name, a place, a county, etc. The wild cards you can use are the underline (or break character _ ), which replaces just one character, and the asterisk (*), which stands in for one or more characters.

So, if I am searching for WARN or for WARNE, I can do a WARN* search. If I want to find all Smiths in Gloucestershire, I can put GLO* in the appropriate field. This brings up MOST of the Gloucestershire SMITH entries. Since there are at least 20 variants of GLOUCESTERSHIRE, it helps to use the wild card all the time! Now, I have already mentioned the limit of 300, and that raises the question: how do we remove that limit?
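
To make the wild card rules concrete, here is a minimal Python sketch for checking a pattern and matching it against names you have already extracted. It is not the site's own code and does not talk to the site. The function names are made up for illustration, and it assumes that the asterisk may also match nothing at all, since the WARN* example is meant to find plain WARN as well as WARNE.

```python
import re

def census_pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Turn a 1901-style wild card pattern into a compiled regular expression.

    Assumptions (not confirmed by the site): '_' matches exactly one
    character, '*' matches zero or more characters, and matching ignores case.
    """
    # The site only accepts wild cards in the third or later position.
    if any(ch in "_*" for ch in pattern[:2]):
        raise ValueError("wild cards are not allowed in the first two characters")
    parts = []
    for ch in pattern:
        if ch == "_":
            parts.append(".")      # exactly one character
        elif ch == "*":
            parts.append(".*")     # any run of characters (assumed: may be empty)
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)

def matches(pattern: str, value: str) -> bool:
    """True if the whole of value fits the wild card pattern."""
    return census_pattern_to_regex(pattern).fullmatch(value) is not None
```

For example, matches("WARN*", "WARNE") and matches("WA_N", "WARN") are both true, while a pattern such as "W*" is rejected because the wild card comes too early.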

The first way is to put MALE in for one search and FEMALE for a second. That way we can get up to 600 souls. BUT, I can hear you asking, how do I find more? Well, my friends, you start to use the two age fields. If we put 3 and 3 (that is, 3 +/- 3) in the two fields, we restrict our search to children aged from 0 to 6 (almost, more later!). If that fails too, we can put in 0 and leave the second field blank. That brings up all those aged from 0 to 364 days old. In this way we can get up to 600 souls in each single-year age grouping. If any of *those* searches fails, then another approach has to be adopted: we might have to do a town by town search, and that is really dodgy!
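
The arithmetic of the two age fields can be sketched in a few lines of Python. This is only an illustration of the "age plus or minus a range" idea described above. The function names are invented, and it takes no account of the oddities in how the site actually treats ages.

```python
def age_window(centre: int, tolerance: int = 0) -> tuple:
    """Return the (youngest, oldest) ages covered by 'centre +/- tolerance'."""
    return max(0, centre - tolerance), centre + tolerance

def tile_ages(oldest: int, tolerance: int) -> list:
    """List (centre, tolerance) pairs whose windows cover 0..oldest with no overlap."""
    width = 2 * tolerance + 1  # number of whole years in each window
    return [(start + tolerance, tolerance) for start in range(0, oldest + 1, width)]
```

So age_window(3, 3) gives (0, 6), and tile_ages(20, 3) gives the three searches 3 +/- 3, 10 +/- 3 and 17 +/- 3, which between them cover ages 0 to 20. A tolerance of 0 gives the single-year searches.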

In this way I have extracted 21,676 souls with the surname HOLT, one of my researched surnames. It can be messy, but so far, using single year searches, male and female separately, I have not found more than 300 male or 300 female HOLT souls in any year grouping. [In parentheses, do not use this method for SMITH]

In a One Name Study or the like, there *could* be more than 300 souls in one age year, despite breaking them down into the two gender sub-searches. I mentioned doing a town search before, but a county search first might overcome this 300 soul limit. If you are singularly unlucky and find too many in a county, and you *know* that they are mainly centred in a few towns or villages, then do a town/village search and get them this way. Of course, this is only necessary for the age groups which hit the 300 soul limit. Do not do it for all age groups!
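
The splitting strategy (gender first, then age, then county, and only where the limit is actually hit) can be sketched as a small recursive planner. This is a sketch under stated assumptions, not a tool that talks to the census site: count_hits is a hypothetical function you would have to supply, and the field names are made up.

```python
from typing import Callable, Dict, List, Sequence, Tuple

LIMIT = 300  # the cap on souls returned by any one search

Query = Dict[str, object]
Split = Tuple[str, Sequence[object]]  # (field name, values to try)

def plan_searches(query: Query,
                  count_hits: Callable[[Query], int],
                  splits: Sequence[Split]) -> List[Query]:
    """Break a search into sub-searches that each stay within LIMIT.

    count_hits is a stand-in for running a search and seeing how many
    souls it would return. A further split is applied only to the
    sub-searches that still overflow.
    """
    hits = count_hits(query)
    if hits == 0:
        return []                  # nothing to fetch
    if hits <= LIMIT or not splits:
        # Either it fits, or we have run out of ways to split it.
        # In the second case the search will still overflow.
        return [query]
    (field, values), remaining = splits[0], splits[1:]
    plan: List[Query] = []
    for value in values:
        narrower = {**query, field: value}
        plan.extend(plan_searches(narrower, count_hits, remaining))
    return plan
```

A call might look like plan_searches({"surname": "HOLT"}, count_hits, [("gender", ["MALE", "FEMALE"]), ("age", range(0, 111)), ("county", my_counties)]), where my_counties is your own list. The order of the splits is a design choice; it simply follows the order suggested in the text.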

There are a couple of programs that help mechanise the searches. They are Census Manager (CM) and Census Extractor (CE). Each has its advantages and its disadvantages. CM can save the data as Comma Separated Values (CSV) files and in GEDCOM. CE can save the output as CSV or as an Excel (97 or 2000) worksheet (*.XLS). By the time you read this, I will have created a mechanism to link the two programs. A visit to my web site would be an advantage, or just email me at philwarn@ntlworld.com - I am quite happy to answer any queries that you might have about any aspect of this subject.

There seem to be a few 'funnies' about ages. Children aged 18 months (1 1/2) seem to be omitted from the results of a search. Do not ask why, but a search for 200 +/- 99 seems to find them. Thanks to John Holden of the SoG for that one! The other problem arises when there are thousands of one surname: a search for a male aged 1, say, can still fail with the 'too many persons' error.

If you know that Fred Bloggs *should* be at Brighton, Sussex and a search fails to find him, remove the county qualifier and you find him in Brighton, Essex!! We can put that down to transcriber error, my friends! Similarly, use wild cards in town and county names to try to overcome transcriber or enumerator error. Never blame the messenger: the transcribers were told to type what they saw. Why there was not 100% double keying and checking, I shall never know. (Penny pinching petty officialdom?) Similarly, certain night workers like miners, dockers, doctors, nurses, etc will not be there. The rules rule them out!!! That has been true for all known censuses, Phil goes out on a limb and avers!

Similarly, if you are looking for your great-grandfather and you find his wife down as Married and Head of House, he might be around the corner with his mistress. The King is in that boat! King Edward VII is down at Windsor Castle as Head of family and married, and is listed as Edward R & I (Rex et Imperator, King and Emperor), but his good Queen Alexandra is not there. Kicked out for poor Alice Keppel, I wonder, as Prince Louis Battenberg is there too, with *his* family including the late Lord Louis Mountbatten! No, Alice Keppel is not listed at Windsor, unless she went by a nom de plume. We *all* know that Alice was *friendly* with both Bertie and Prince Louis, don't we?

I mentioned CM and CE earlier. They are available for free download from the URLs below. CE is totally free; CM is shareware, with a mere £10 registration fee after the usual 30 day trial. I have no connection with either author, save that CE is written in my birth town of Leeds, and CM is written by a colleague of 25 years ago, Keith Sheffield. Neither pays me for plugging his products!

1) CE http://leedsindexers.co.uk/Internet_Tools.htm
2) CM http://uk.geocities.com/kgnsheffield/Ht ... nsus1.html

3) My web pages start here, too: http://members.lycos.co.uk/philwarnorpkent/

Finally, for your edification, I list the variants of GLOUCESTERSHIRE that I have found so far in the 1901 Census. I suspect it is not a complete list!

1) GLOUCESTERSHIRE

2) GLOS

3) GLORSH

4) GLOUCESTER

5) GLOSTER

6) GLOSHIRE

7) GLOSTERSHIRE

8) GLOCESTERSHIRE

9) GLO SHIRE

10) GLOUCS

11) GLOSE

12) GLUCEST

13) GLOSTR

14) GLOSHRE

15) GLOUCESTERSH

16) GLOUCESTERS

17) GLOR

18) GLORS

19) GLOUCESTESHIRE

20) GLOSTERSHIER

21) GLOSTERE

22) GLOSTERS

23) GLOUT
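
As a small illustration of why GLO* brings up only MOST of the entries, the list above can be checked in Python against a prefix, and the same list can be used to tidy the county column after extraction. This is only a sketch; the function names are invented, and the list is just the 23 spellings given above.

```python
# The 23 spellings listed above, exactly as found.
VARIANTS = [
    "GLOUCESTERSHIRE", "GLOS", "GLORSH", "GLOUCESTER", "GLOSTER", "GLOSHIRE",
    "GLOSTERSHIRE", "GLOCESTERSHIRE", "GLO SHIRE", "GLOUCS", "GLOSE", "GLUCEST",
    "GLOSTR", "GLOSHRE", "GLOUCESTERSH", "GLOUCESTERS", "GLOR", "GLORS",
    "GLOUCESTESHIRE", "GLOSTERSHIER", "GLOSTERE", "GLOSTERS", "GLOUT",
]

def missed_by_prefix(prefix: str) -> list:
    """Variants that a 'prefix*' wild card search would not bring up."""
    return [name for name in VARIANTS if not name.startswith(prefix.upper())]

def normalise_county(raw: str) -> str:
    """Map any known variant to GLOUCESTERSHIRE; leave anything else alone."""
    cleaned = raw.strip().upper()
    return "GLOUCESTERSHIRE" if cleaned in VARIANTS else cleaned
```

With this list, missed_by_prefix("GLO") returns just GLUCEST, and missed_by_prefix("GL") returns nothing. GL* is the shortest pattern the third-place rule allows, though it would presumably also pull in other counties beginning with GL.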

Phil. Warn 15:50 Tuesday 19 November 2002



[grin]

Post by anyquist » 21 Nov 2002 08:04

I found this most informative, especially for a novice in this area.

Post by Geoffers » 22 Nov 2002 19:02

A couple of other things.....
1) The ages on the PRO website can be slightly out too. I've found a few people recorded as over 300 years of age - no wonder I couldn't find their burials anywhere.
2) If anyone is searching for family in Norfolk, some of us have put transcriptions of several places at http://www.genealogy.doun.org/transcriptions/index and more will be added with time. It might help you confirm details before downloading images from the PRO site.

Post by davepacey » 25 Nov 2002 22:31

I have been using this method for quite a while, with the 1901 Census Extractor mentioned above. The program has an option to delete duplicates, but it only does this for records where every field is duplicated. I find the best way is to export the records to a CSV file, then import them into MS Access with the person ID field set to indexed, no duplicates. Only unique records are then pasted; the failures go into a table called import errors. You can then examine the records in this table to see the obvious errors, such as widowed males aged under 15, say, or married females or males aged under 14, and so forth. On my searches of the MOODY surname, for instance, I found more than 300 children under 10 listed as married. I don't think for one minute that they are in the transcriptions as married. More likely it is a bug in the search engine.
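
For anyone without MS Access, the same idea can be sketched in Python: keep the first record for each person ID, set the rest aside, and flag the implausible ones. This is a rough sketch only. The column names PersonID, Age and Condition are guesses at what an exported CSV might contain, and the upper age of 110 is an arbitrary threshold of my own choosing.

```python
import csv
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, str]

def load_rows(path: str) -> List[Row]:
    """Read an exported CSV file into a list of rows keyed by column name."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))

def split_duplicates(rows: Iterable[Row], key: str = "PersonID") -> Tuple[List[Row], List[Row]]:
    """Keep the first row seen for each person ID; set later ones aside."""
    seen = set()
    unique: List[Row] = []
    rejected: List[Row] = []
    for row in rows:
        if row[key] in seen:
            rejected.append(row)   # plays the part of the import errors table
        else:
            seen.add(row[key])
            unique.append(row)
    return unique, rejected

def is_suspect(row: Row) -> bool:
    """Flag records that look wrong on the face of it."""
    try:
        age = int(row["Age"])
    except ValueError:
        return True                # a non-numeric age deserves a look
    status = row["Condition"].strip().upper()
    if status == "MARRIED" and age < 14:
        return True
    if status == "WIDOWED" and age < 15:
        return True
    return age > 110               # assumed threshold, not from the source
```

Typical use would be unique, rejected = split_duplicates(load_rows("moody.csv")), where the file name is just an example, followed by a look through the rejected rows and any row for which is_suspect is true.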

Well done for a good informative article

Dave Pacey

Post by davepacey » 25 Nov 2002 22:33

I ought to mention that the 300 entries listed as married were also listed as single; the same goes for the under-age widows etc.

Post by admin » 20 Dec 2002 12:57

I just searched for Taubmans in Liv*, for Liverpool. One problem: on most records Liverpool is shown as Lpool!

Post by dbridge276 » 18 Jun 2003 20:09

I would endorse the recommendation to use the 1901 Census Extractor & GuessTimator (http://www.LeedsIndexers.co.uk). The option to save each page is brilliant, as are the sort and export.

Once you have created this long list, you can sort on the PersonID and find family groups very easily.
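
One way to picture the sort on PersonID is to gather runs of consecutive IDs. This is a minimal sketch that assumes the IDs are whole numbers and that people in the same household were given neighbouring numbers. That is only a guess at why the sort works, so treat each run as a hint, not proof of a family.

```python
from typing import Iterable, List

def group_consecutive(person_ids: Iterable[int]) -> List[List[int]]:
    """Sort the IDs and split them into runs of consecutive numbers."""
    groups: List[List[int]] = []
    for pid in sorted(set(person_ids)):
        if groups and pid == groups[-1][-1] + 1:
            groups[-1].append(pid)   # continues the current run
        else:
            groups.append([pid])     # a gap starts a new run
    return groups
```

For instance, the IDs 105, 101, 102, 200, 103 and 201 fall into three runs: 101 to 103, 105 on its own, and 200 to 201.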
