Extracting census address from source title

Posted: 08 Feb 2020 10:01
by Mark1834
I have a lot of old census records in my database that pre-date my use of FH so do not include the detailed address. All my censuses have single individual citations to a corresponding source record with a title along the lines of "Census, 1881: Bermondsey, SRY (156 Jamaica Road)". Extracting this address data where it is present seems like an obvious job for a small custom plug-in. Unfortunately, it's two years since I last dabbled with writing plug-ins, and I'm very rusty! The following code seems to work ok, judging by source queries run before and after, but does it look reasonable please? I'm very wary of unintended consequences to other data!

I know it would need more refinement to generalise application (error testing, etc), but it needs to run only once on each database so can be relatively quick and dirty as long as it is correct!

local ptIndi = fhNewItemPtr()					-- record pointer
ptIndi:MoveToFirstRecord('INDI')				-- set pointer to first INDI record

while ptIndi:IsNotNull() do					-- loop through individuals
	local ptCensus = fhGetItemPtr(ptIndi, '~.CENS')
	while ptCensus:IsNotNull() do				-- loop through their census events
		local strAddress = fhGetItemText(ptCensus, '~.ADDR')	-- existing census address ('' if none)
		if strAddress == '' then
			local strTitle = fhGetItemText(ptCensus, '~.SOUR>TITL')	-- title of first cited source
			local strSourceAddress = string.match(strTitle, "%((.-)%)")	-- text inside first ( )
			if strSourceAddress ~= nil then				-- create new address field
				local ptAddress = fhCreateItem('ADDR', ptCensus)
				fhSetValueAsText(ptAddress, strSourceAddress)
			end
		end
		ptCensus:MoveNext('SAME_TAG')				-- move to next census event
	end
	ptIndi:MoveNext('SAME_TAG')						-- move to next individual
end
Thanks in anticipation!

Re: Extracting census address from source title

Posted: 08 Feb 2020 11:41
by tatewise
Yes, that looks good to me, given your assumptions:
The CENS events you are interested in have their first SOUR Citation linked to an appropriate Source record.
Those CENS events must also have an empty ADDR field.
The Source record TITL must contain a matched pair of ( parentheses ); the text inside the first pair is what gets copied.

Only when those criteria are satisfied will the ADDR field get updated.
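
To see how the third criterion behaves on different titles, here is a minimal sketch in plain Lua that can be tried outside FH. The helper name extractAddress and all titles except the first are made up for illustration; the pattern is the one from the plugin above.

```lua
-- Hypothetical helper wrapping the pattern used in the plugin.
-- Returns the text inside the first matched pair of parentheses, or nil.
local function extractAddress(strTitle)
	return string.match(strTitle, "%((.-)%)")
end

-- Title in the style quoted in the first post
print(extractAddress("Census, 1881: Bermondsey, SRY (156 Jamaica Road)"))	-- 156 Jamaica Road

-- No parentheses, or an unmatched one: no match, so no ADDR would be created
print(extractAddress("Census, 1881: Bermondsey, SRY"))			-- nil
print(extractAddress("Census, 1881: Bermondsey, SRY (156 Jamaica Road"))	-- nil

-- Two pairs: only the first is returned, which may not be the address
print(extractAddress("Census, 1881: Bermondsey (SRY) (156 Jamaica Road)"))	-- SRY
```

If any titles use parentheses for something other than the address, as in the last case, those would need checking by hand before running the plugin.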

In this case you don't have to worry about whether the ADDR tag already exists, as FH automatically removes empty ADDR fields, but the following is a slightly more rigorous test:
ptAddress = fhGetItemPtr(ptCensus, '~.ADDR') -- census address
if ptAddress:IsNull() then
	-- no ADDR field yet, so create one from the source title as before
end

Re: Extracting census address from source title

Posted: 08 Feb 2020 14:06
by Mark1834
Thanks Mike - one minor refinement I will add is to flag any examples where an existing address field differs from the text in the source title, but that’s straightforward now I know that the main logic is correct.
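
A rough sketch of how that flagging could be decided, written as plain Lua so it can be tried without touching the database. The function name compareAddress and its return strings are hypothetical, and it assumes an exact text comparison is wanted; addresses that differ only in spacing or case would be reported as different.

```lua
-- Hypothetical helper: decide what to do for one census event, given its
-- existing address text ('' if none) and the title of its source record.
local function compareAddress(strAddress, strTitle)
	local strSourceAddress = string.match(strTitle, "%((.-)%)")
	if strSourceAddress == nil then
		return 'no address in title'	-- nothing to extract
	elseif strAddress == '' then
		return 'add'			-- safe to create the ADDR field
	elseif strAddress == strSourceAddress then
		return 'same'			-- already recorded, nothing to do
	else
		return 'differs'		-- flag for manual review
	end
end

local strTitle = "Census, 1881: Bermondsey, SRY (156 Jamaica Road)"
print(compareAddress('', strTitle))			-- add
print(compareAddress('156 Jamaica Road', strTitle))	-- same
print(compareAddress('158 Jamaica Road', strTitle))	-- differs
```

Inside the plugin loop, strAddress and strTitle would come from the same fhGetItemText calls as before, and only the 'add' case would go on to create the ADDR field.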

Re: Extracting census address from source title

Posted: 08 Feb 2020 15:05
by tatewise
What I like to do, despite it being a bit fiddly, is to add a Result Set, which in this case could show every Census Event with its Address and its Source Title for those with the parentheses.
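
As a rough illustration of that idea, the sketch below collects one row per census event whose source title contains parentheses and then outputs three columns. It is untested against a real database: the arguments to fhOutputResultSetTitles and fhOutputResultSetColumn are written from memory of the plugin help and should be checked there, and the names addRow, collectAndShow, the column headings and the widths are all made up for illustration.

```lua
-- Column tables for the Result Set, filled one row at a time.
local tblEvent, tblAddress, tblTitle = {}, {}, {}
local iRows = 0

-- Hypothetical helper: append one row to the three column tables.
local function addRow(item, strAddress, strTitle)
	iRows = iRows + 1
	tblEvent[iRows] = item
	tblAddress[iRows] = strAddress
	tblTitle[iRows] = strTitle
end

-- Walk every individual's census events and show the Result Set.
-- Only callable inside FH, where the fh* functions exist.
local function collectAndShow()
	local ptIndi = fhNewItemPtr()
	ptIndi:MoveToFirstRecord('INDI')
	while ptIndi:IsNotNull() do
		local ptCensus = fhGetItemPtr(ptIndi, '~.CENS')
		while ptCensus:IsNotNull() do
			local strTitle = fhGetItemText(ptCensus, '~.SOUR>TITL')
			if string.match(strTitle, "%((.-)%)") then
				-- Clone the pointer so the stored copy is unaffected by MoveNext
				addRow(ptCensus:Clone(), fhGetItemText(ptCensus, '~.ADDR'), strTitle)
			end
			ptCensus:MoveNext('SAME_TAG')
		end
		ptIndi:MoveNext('SAME_TAG')
	end
	-- Assumed signatures: check against the FH plugin help before relying on them
	fhOutputResultSetTitles('Census addresses from source titles')
	fhOutputResultSetColumn('Census Event', 'item', tblEvent, iRows, 160, 'align_left')
	fhOutputResultSetColumn('Address', 'text', tblAddress, iRows, 140, 'align_left')
	fhOutputResultSetColumn('Source Title', 'text', tblTitle, iRows, 240, 'align_left')
end

-- Run only when the FH plugin API is available
if fhNewItemPtr then
	collectAndShow()
end
```

Run after the update plugin, this would list each affected event alongside the title its address came from, so any odd extractions are easy to spot.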

Re: Extracting census address from source title

Posted: 08 Feb 2020 16:10
by Mark1834
Indeed - I thought of something like that, but I haven’t done Result Sets yet, so that is a learning exercise for another day :). In the meantime, I have a simple custom query that does more or less the same thing, albeit run as a separate exercise.