* Occupations
- DavidNewton
- Superstar
- Posts: 462
- Joined: 25 Mar 2014 11:46
- Family Historian: V7
Occupations
I am thinking of using the Occupation Descriptor as well as the Occupation description, and my thought is to split my occupation data into 'Occupation' and 'Descriptor'. So I would welcome opinions as to how this should be done, or even if it should be done.
At this moment I am thinking along these lines: 'Coal miner, pony driver underground' becomes Coal Miner (Descriptor: Pony Driver Underground). The advantage of this is that in Working with Data > Occupations I would get overall numbers easily. The disadvantage, of course, is that the fine detail becomes almost invisible, but it does show in the custom query 'Occupations List'.
http://www.fhug.org.uk/wiki/doku.php?id ... ccupations
I have searched the forum for Occupation Descriptor and came up with just one hit
http://www.fhug.org.uk/forum/viewtopic. ... tor#p18846
in which the use of the descriptor is discussed.
David
- tatewise
- Megastar
- Posts: 27089
- Joined: 25 May 2010 11:00
- Family Historian: V7
- Location: Torbay, Devon, UK
Re: Occupations
As you say there are problems with the visibility of the Descriptor sub-field.
An alternative is to use an unusual character such as tilde (~) between the two parts of the Occupation.
e.g. 'Coal miner ~ pony driver underground'
Then a Plugin similar to my Occupations Per Census Year and Gender could produce a Result Set of counts for just the part of the Occupation preceding the tilde (~).
Mike Tate ~ researching the Tate and Scott family history ~ tatewise ancestry
- DavidNewton
- Superstar
- Posts: 462
- Joined: 25 Mar 2014 11:46
- Family Historian: V7
Re: Occupations
Mike
I was unaware of your plugin. I have downloaded it and will now spend some time trying to follow how it works and then, I hope, modify it to use your idea of the ~ separator, as it solves both my issues simultaneously.
David
- DavidNewton
- Superstar
- Posts: 462
- Joined: 25 Mar 2014 11:46
- Family Historian: V7
Re: Occupations
Excellent. Your idea of the separator works really well. Following your plugin, and drastically reducing the amount of data collected, I was able to write a simple plugin that just counts the number of individuals who recorded a particular occupation, discarding duplicate mentions of the same occupation. I twigged while writing the plugin that Working with Data counts each mention of an occupation, whereas the Occupations List query mentions only the first occupation that an individual lists. Your ~ idea means that as they move tasks within the industry they are still counted as coal miners.
All that is left is to go through all the occupations data and edit it into an appropriate format. Many thanks.
David
- jimlad68
- Megastar
- Posts: 911
- Joined: 18 May 2014 21:01
- Family Historian: V7
- Location: Sheffield, Yorkshire, UK (but from Lancashire)
Re: Occupations
David, any chance of publishing (work in progress) or inserting a copy of your amended code in this post? It will save reinventing the wheel and/or help others with future coding.
Jim Orrell - researching: see - but probably out of date https://gw.geneanet.org/jimlad68
- DavidNewton
- Superstar
- Posts: 462
- Joined: 25 Mar 2014 11:46
- Family Historian: V7
Re: Occupations
Jim
I am not intending to publish this, so work in progress is not accurate. However, I am attempting to attach the plugin file. Please excuse my non-standard choices of variable names - I am not a programmer.
- Jane
- Site Admin
- Posts: 8442
- Joined: 01 Nov 2002 15:00
- Family Historian: V7
- Location: Somerset, England
Re: Occupations
Hi David, just a thought: you could use match rather than find, and that way you can extract the string in one line. It's worth getting to grips with match and gsub as they are really useful.
For example
Code:
str = str:match('(.-) ~') or str
This will grab all the characters up until the first space ~ combination and if that's nil return the original string.
So your function would reduce to
Code:
function ExtOccu(str)
  return str:match('(.-) ~') or str
end
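To see what that one-liner does in practice, here is a minimal sketch that runs it on a few made-up occupation strings (the sample values are assumptions for illustration, not data from the attached plugin):

```lua
-- The one-line version of the function from the post above.
function ExtOccu(str)
  return str:match('(.-) ~') or str
end

-- Made-up sample values, for illustration only.
print(ExtOccu('Coal miner ~ pony driver underground'))  -- Coal miner
print(ExtOccu('Farmer'))                                 -- Farmer (no ' ~', so the original comes back)
print(ExtOccu('Coal miner~hewer'))                       -- Coal miner~hewer (no space before the ~)
```

Note the last case: because the pattern requires a space before the tilde, an entry typed without that space is returned unchanged.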
Jane
My Family History : My Photography "Knowledge is knowing that a tomato is a fruit. Wisdom is not putting it in a fruit salad."
- DavidNewton
- Superstar
- Posts: 462
- Joined: 25 Mar 2014 11:46
- Family Historian: V7
Re: Occupations
Thanks Jane
In the book I have been using, string.find comes first and it did the job; also, generally I didn't have to worry too much about patterns.
I'm trying to work out the pattern you suggested. Is this correct?
The () means grab the string within; '.-' means 0 or more characters, minimally expanded until it reaches the two characters ' ~'.
So if I wanted to grab the space as well, ensuring that it is a space, would the pattern be '(.- )~'? And if I just want to grab everything up to the first '~', would it be '(.-)~'?
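The three patterns can be compared side by side with a small sketch like the one below. ShowCapture is a hypothetical helper written just for this comparison, and the sample strings are made up; the brackets are only there to make a trailing space visible.

```lua
-- Hypothetical helper: apply a pattern and print the capture in brackets.
function ShowCapture(str, pattern)
  local captured = str:match(pattern)
  local shown = captured and ('[' .. captured .. ']') or 'nil'
  print(pattern, shown)
  return captured
end

local sample = 'Coal miner ~ pony driver underground'
ShowCapture(sample, '(.-) ~')   -- [Coal miner]   space and tilde both left out
ShowCapture(sample, '(.- )~')   -- [Coal miner ]  space kept, tilde left out
ShowCapture(sample, '(.-)~')    -- [Coal miner ]  everything before the first ~

-- With no space before the tilde, the pattern that insists on a space finds nothing.
ShowCapture('Coal miner~hewer', '(.- )~')   -- nil
ShowCapture('Coal miner~hewer', '(.-)~')    -- [Coal miner]
```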
David
- tatewise
- Megastar
- Posts: 27089
- Joined: 25 May 2010 11:00
- Family Historian: V7
- Location: Torbay, Devon, UK
Re: Occupations
Yes, exactly correct David.
The match function has a further trick.
If you also want to capture the string after the ~ then use:
Code:
before, after = str:match("(.-)~(.*)")
While editing your Plugin, click its Help > Lua Online Reference Manual to display the reference manual in your browser.
For details on this scenario scroll down the index and click on one of the following:
5 – Standard Libraries
5.4 – String Manipulation
5.4.1 – Patterns
There is a neater method for updating your counter for Occupations.
The logic is that the Individual Occupation entry is nil (i.e. false) until the Occupation sets it to true.
If the Individual Occupation entry is false (i.e. not true) then the count is initialised to 0 and incremented.
So your two if statements become:
Code:
if not IndiOcc[strOcc] then -- Occupation not found yet for this Individual
  IndiOcc[strOcc] = true -- Record that Occupation is found
  Occup[strOcc] = ( Occup[strOcc] or 0 ) + 1 -- If count is nil then use 0 and finally add 1
end
For more advice see the V5 Plugins Developer Guide section.
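Putting the two-capture match and the counter idiom together, a rough sketch of the whole counting step might look like this. It uses plain Lua tables in place of the Family Historian records, so CountOccupations, the sample data and the whitespace trimming are all assumptions for illustration rather than a copy of the attached plugin.

```lua
-- Sketch only: plain Lua tables stand in for the Family Historian records.
-- CountOccupations and the sample data are hypothetical.
function CountOccupations(individuals)
  local Occup = {}                     -- occupation -> number of individuals
  for _, occupations in ipairs(individuals) do
    local IndiOcc = {}                 -- occupations already seen for this individual
    for _, str in ipairs(occupations) do
      -- before/after are both nil when the entry has no ~ ('after' is unused here)
      local before, after = str:match("(.-)~(.*)")
      -- fall back to the whole entry, then trim surrounding spaces
      local strOcc = (before or str):match("^%s*(.-)%s*$")
      if not IndiOcc[strOcc] then      -- Occupation not found yet for this Individual
        IndiOcc[strOcc] = true
        Occup[strOcc] = ( Occup[strOcc] or 0 ) + 1
      end
    end
  end
  return Occup
end

local counts = CountOccupations({
  { "Coal miner ~ pony driver underground", "Coal miner ~ hewer", "Farmer" },
  { "Coal miner", "Farmer ~ dairyman" },
})
for occupation, count in pairs(counts) do
  print(occupation, count)
end
```

With this sample, 'Coal miner' and 'Farmer' each come out as 2: the first individual's two coal-mining entries count once. Differences in capitalisation (e.g. 'Coal Miner' versus 'Coal miner') would still be counted separately in this sketch.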
Mike Tate ~ researching the Tate and Scott family history ~ tatewise ancestry
- DavidNewton
- Superstar
- Posts: 462
- Joined: 25 Mar 2014 11:46
- Family Historian: V7
Re: Occupations
That is much neater and the logic is still transparent - a very important factor. I have a tendency to include too many comments but I do not want to fall into the trap described below. The remainder of this post is not relevant to Occupations but is relevant to Documentation.
Many years ago I read a book "The Mathematical Experience" by Philip Davis & Reuben Hersh. In one of the chapters they are describing the 'Ideal Mathematician'. Let me quote two short passages about the ideal mathematician, I have left out some of the text.
However, "To his fellow experts he communicates his results in a casual shorthand. If you apply a tangential mollifier to the left quasi-martingale you get an estimate better than quadratic..."
"His writing follows an unbreakable convention: to conceal any sign that the author, or the intended reader, is a human being. ... The intended readers (all twelve of them) can decode the formal presentation....and see what the author is doing and why he does it. But for the noninitiate this is a cipher that will never yield its secret