Utility ~ Census CSV to GEDCOM Conversion

Description

A short program for converting census data from a CSV file (as can be produced by EXCEL) into a GEDCOM file. The Zip file contains the program itself, and an Adobe PDF document describing what it does and the requirements it has for the order of data, as reproduced below.

Produced by peterjward (13/03/10)

Introduction

You may have England and Wales census information transcribed into a spreadsheet such as Excel. The rows usually represent individuals and the columns data items about them. With most database and spreadsheet programs it should be possible to export this data into a CSV file. This is a text based file format where the data is arranged with each item of text information (from a spreadsheet row) separated from the next by a comma, and the end of each row terminated with a carriage return/linefeed. It is important however that your data does not itself contain any commas anywhere as this can disrupt the file produced – if necessary replace with semicolons(;), colons(:) or periods (.). Neither should it contain double quotes (“) as these are used by this program during processing for delineating strings.

The program (written in Visual Basic) will input this information and produce a GEDCOM file (of the same name as your CSV file) that can be opened and read by any genealogical program (such as Family Historian) with such an ability. This information is concerned with individuals only, and does not connect them into families – this you will have to do yourself. It does recognise married or widowed women as such and adds an asterisk (*) to the end of their surname to indicate that it is not a maiden name. This is easily removed later if you do not want it using a word processor that can deal with text files (such as Wordpad, Notepad or Word). It does not do anything to distinguish married men.

Details

It creates a record for each individual with name and sex. It will calculate an approximate birth year or date from the given age: It recognises ages in days, weeks and months (e.g. 7m, 12d, 5w), weeks being converted to months; anything else is interpreted as years. It will add a place of birth if this is present, using just the county of the census if the year is 1841. A census event is generated with appropriate date, location and address (if known), plus age: The relation to the head of the household is also added as a note to the event (except 1841 of course). The marital status is not recorded in any way except for wives and widows as indicated above. Towns and counties are concatenated for places of birth and census (e.g. “Guildford” and “Surrey” will become “Guildford, Surrey”).

A source reference is generated from the piece/folio/page information so that it represents one page of a census return. This reference is appended to the census event and to the individual record as a whole.

It requires all the input to be formatted as text (even dates and numbers), and in the correct order, which differs somewhat for the earliest (1841) and latest (1911) censuses compared with the rest. You should also ensure that your data is sorted by piece, folio, page and schedule before creating your CSV file as this will ensure that you get an orderly set of source names. It can deal with a mixture of pieces for any one year, but can only manage one year at a time.

A certain amount of error trapping has been done and missing items of data (blank spaces) should only mean corresponding blanks in the resulting GEDCOM file. I have tested this fairly thoroughly in Windows XP and VISTA without any problems but have not tried it with other operating systems. I originally designed it for my own use and I can (and will) give no warranty. However it will not delete your original files, merely produce an additional new one (but will overwrite an existing GEDCOM file of the same name). If it should crash the reason will be that your CSV file has not been read properly – you do not have the correct number of data items for any individual or else there is probably a stray comma (,) or double quote (“) which leads to the program misreading the items of data; you will then almost certainly get an error message. And if you do not have the data items in the right order then the GEDCOM produced will be garbage.

The program should be placed in the same folder as your CSV file. The GEDCOM file produced will appear in the same folder. You need to tell it the year in question and the name of your CSV file (case insensitive, and leave off the .csv file suffix). Navigation round the three ‘controls’ on the form is by mouse or tab key. The year selection is made by using the direction keys, filename entry by typing or copy and paste. When GO is selected it is activated by pressing the space bar or left clicking the mouse. The program gives no sign that it has finished processing – so just check the folder for the new Gedcom file. A fair-sized file (3000 individuals) is processed in less than a second. You may process any number of CSV files at any one time by simply entering the new filenames and years. The program is closed like any window by clicking on the small X box in the top right hand corner.

The data order required is as follows:

1841 Census

Census county The county of the Census ( NOT where born)
Census place The town/village/hamlet
Address Often not present
Piece number The number immediately after HO107
Book number Larger pieces may have several books
Folio number At top of alternate pages
Page number Found at the top of each page
Forename Obvious – no separate column for middle names/initials
Surname Also obvious
Sex Simple M or F
Age Days/weeks/months recognised and treated appropriately
County Born Simple Y or N – no notice taken of I(reland) etc

1851 to 1901 Census

Census county The county of the Census ( NOT where born)
Census place The town/village/hamlet
Address House / street/ etc,
Piece No. The number immediately after the RG9, RG10 etc
Folio No. The number following the piece no.
Page No. The number following the folio no.
Schedule No. The number identifying the household
Forename(s) Include middle names or initials
Surname Will accept double-barrelled names
Relation Head / Wife / Son / Servant / Boarder / Lodger etc.
Status Married / Widowed / Single (or unmarried)
Sex Simple M or F
Age Days/weeks/months recognised – anything else means years
Birthplace Town etc
Birth County May be foreign country

1911 Census

Census county The county of the Census ( NOT where born)
Census place The town/village/hamlet
Address House / street/ etc,
Piece Comes immediately after RG14 in census reference
Registration Dist Number immediately after RD in Census reference
Subdistrict Number immediately after SD in Census reference
Enumeration District Number immediately after ED in Census reference
Schedule Number immediately after SN in Census reference
Forename(s) Include middle names
Surname Will accept double-barrelled names
Relation Head / Wife / Son / Servant / Boarder / Lodger etc.
Marital Status Married / Widowed / Single (or unmarried)
Sex Simply M or F
Age Days/weeks/months recognised
Birthplace Town of birth
Birth county County of birth – sometimes foreign country

Screenshot

CSV to GEDCOM

Installation Instructions

Click on the link below to download and save the csvtogedcomcensusimport.zip file.

This Zip file contains the program and the above PDF information. Simply extract the two files to a folder.

Rearrange the data into the order required for the census year in question as above. Make sure you have removed or converted commas and double quotes and then save as a CSV (Comma Separated Values) file, and remember the file name. Move this into the folder with the program and run the program by simply double clicking on it (you do not have to have Visual Basic installed on your computer).

The Census CSV to GED.exe file may need right-click > Properties > Unblock to allow it to work correctly.

If you receive an error message saying MSVBVM50.DLL is missing, then download the Visual Basic 5.0 Run-Time Files from http://support.microsoft.com/kb/180071 and install as a Windows Administrator.

Download