* Find Duplicate Place Names
-
quarlton
- Famous
- Posts: 150
- Joined: 26 Feb 2004 13:07
- Family Historian: V7
- Location: Lincolnshire
- Contact:
Re: Find Duplicate Place Names
Hi Mike
The penny has dropped now with the Newton issue!
Oh well, at least the exercise has given my brain a workout.
Thanks for all your help.
Dave
Dave Simpson ~ Boulton, Braham, Carney, Simpson and Jacobs
Re: Find Duplicate Place Names
One might ameliorate the situation with a table that saves the pattern for places and the pattern for addresses,
and then use those patterns to standardize place and address records by running each one through a function.
FH V.6.2.7 Win 10 64 bit
- tatewise
- Megastar
- Posts: 27077
- Joined: 25 May 2010 11:00
- Family Historian: V7
- Location: Torbay, Devon, UK
- Contact:
Re: Find Duplicate Place Names
I don't understand how that would work, Ron.
Can you give an example of a pattern for Place names that would work for the names 1) to 5) that I posted on Wednesday?
Mike Tate ~ researching the Tate and Scott family history ~ tatewise ancestry
-
quarlton
- Famous
- Posts: 150
- Joined: 26 Feb 2004 13:07
- Family Historian: V7
- Location: Lincolnshire
- Contact:
Re: Find Duplicate Place Names
Mike
Thanks to your rewriting of the Tidy function, I believe that I can streamline the procedure a bit.
Instead of using my function to test if a place name needs tidying, then using the Tidy function, I could simply call the Tidy function with each place name.
Then if the result is changed from the original I have carried out the test and correction in one go.
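As a rough illustration of that one-pass idea, here is a minimal Python sketch. It is not the plugin's code: `tidy` is a hypothetical stand-in that only trims spaces and drops empty comma-separated parts, and `tidy_changes` is an invented helper name. The point is simply that comparing the tidied result with the original is both the test and the correction.

```python
def tidy(place_name: str) -> str:
    """Hypothetical stand-in for the plugin's Tidy function (assumption):
    trim each comma-separated part and drop the empty ones."""
    parts = [part.strip() for part in place_name.split(",")]
    return ", ".join(part for part in parts if part)


def tidy_changes(place_names):
    """Return {original: tidied} for every name that tidy() alters."""
    changes = {}
    for name in place_names:
        tidied = tidy(name)
        if tidied != name:  # a difference means the name needed tidying
            changes[name] = tidied
    return changes
```

Names that come back unchanged never appear in the result, so no separate "does this need tidying?" function is required.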
Ron
Sorry but I haven't managed to get my head around your suggestion.
It may be a bit beyond my understanding and capabilities.
Dave Simpson ~ Boulton, Braham, Carney, Simpson and Jacobs
- tatewise
- Megastar
- Posts: 27077
- Joined: 25 May 2010 11:00
- Family Historian: V7
- Location: Torbay, Devon, UK
- Contact:
Re: Find Duplicate Place Names
For your interest, the attached Find Duplicate Place Names plugin is my variant that only identifies Place records that will need merging if any universal format were applied.
I don't understand Ron's suggestion either.
Last edited by tatewise on 25 Apr 2022 15:18, edited 1 time in total.
Reason: Attachment deleted as later version is in the Plugin Store.
Mike Tate ~ researching the Tate and Scott family history ~ tatewise ancestry
-
quarlton
- Famous
- Posts: 150
- Joined: 26 Feb 2004 13:07
- Family Historian: V7
- Location: Lincolnshire
- Contact:
Re: Find Duplicate Place Names
Thanks Mike, I'll have a look at it this morning.
Dave Simpson ~ Boulton, Braham, Carney, Simpson and Jacobs
Re: Find Duplicate Place Names
Mike, it would only ameliorate; that's why I used the word.
For example, this program would never work for me, not ever (and for many other people; it is not useful to most).
My place is:
town, township (or district) "[PLCC]", county, state, nation
My address is:
building, address, town, state, zip, nation
And even those have exceptions:
"near" Lefore, , , ND, USA
"1.8 m North of" Mazeppa, , Olmsted, MN, USA
And as you know, I am very picky about running them through MLF successfully, and about having place/address pairs pass a non-rigorous haversine test, i.e. any place/address pair including London MUST be within 24 miles of each other, and if that happens, I try to determine whether it's the Mayfair or Whitechapel district and so on to get the numbers closer.
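For readers unfamiliar with the haversine test mentioned above, a minimal Python sketch of such a check might look like this. The function names, the `(latitude, longitude)` tuple layout, and the mean Earth radius constant are assumptions made for illustration; only the 24 mile limit comes from the post.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius (assumed constant)


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def pair_is_plausible(place, address, limit_miles=24.0):
    """True if the place and address coordinates lie within limit_miles.
    Both arguments are (latitude, longitude) tuples (hypothetical layout)."""
    return haversine_miles(*place, *address) <= limit_miles
```

A pair that fails the check would be a candidate for refining the place to a district-level location, as described above.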
A gazetteer, even if you could steal just the one from toponomy for England and work it out into an understandable Lua table, won't work positively.
But the plugin as it exists will list every place I have in the file as an exception.
No program can account for every exception or check every atomic for correct positioning; that is a given in any plugin.
Somewhere in everyone's family history is a good old No. 10 Boris Johnson.
Having a generalized place/address pair of patterns would, when checked against Tidy and the original, ameliorate the plugin's decisions about what to report.
Even with a comprehensive gazetteer,
Nobody, Expects, the, Spanish, Inquisition
may have to be an acceptable place name in form.
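One possible reading of the pattern idea, sketched in Python purely as an illustration: each kind of record has a fixed list of comma-separated fields, and a name "fits the pattern" when it has that many fields, blanks included. The `PATTERNS` table and the function names are hypothetical; the field lists are taken from the place and address layouts above.

```python
# Hypothetical pattern table (assumption): field names per record kind.
PATTERNS = {
    "place": ["town", "township", "county", "state", "nation"],
    "address": ["building", "address", "town", "state", "zip", "nation"],
}


def split_fields(text):
    """Split on commas, keeping empty fields so positions stay meaningful."""
    return [field.strip() for field in text.split(",")]


def matches_pattern(text, kind):
    """True if the text has exactly as many fields as the pattern expects."""
    return len(split_fields(text)) == len(PATTERNS[kind])


def label_fields(text, kind):
    """Pair each field with its pattern name, or return None on a mismatch."""
    fields = split_fields(text)
    names = PATTERNS[kind]
    if len(fields) != len(names):
        return None
    return dict(zip(names, fields))
```

Under this reading the exceptions with blank fields still fit the place pattern, and so does the Spanish Inquisition example, which is the point being made: the form can be checked, the content cannot.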
FH V.6.2.7 Win 10 64 bit
- trevithick
- Silver
- Posts: 9
- Joined: 31 Jul 2022 21:43
- Family Historian: V7
- Location: Florida Panhandle
Re: Find Duplicate Place Names
Thank you Mike for this Plugin. It really helps in cleaning up Places that were not corrected by the Convert Legacy Places plugin.
I have two questions, one related to this plugin specifically. Would it be possible to have check boxes and a Select All button for the Tidy Name Places, and then a Select button that would eliminate the time-consuming job of Merging?
The second question is directly related to Merging. I've searched, but have not found a way to add the Merge/Compare Records to the Toolbar. Have I missed it, or is it hidden to prevent accidental merges?
- tatewise
- Megastar
- Posts: 27077
- Joined: 25 May 2010 11:00
- Family Historian: V7
- Location: Torbay, Devon, UK
- Contact:
Re: Find Duplicate Place Names
It seems you are talking about the Result Set displayed in FH after the Plugin has finished.
Displays like that produced by Plugins are just like the Result Sets produced by Queries.
Proactive buttons such as you have requested are simply not supported in any Result Set.
Result Sets are passive data lists and cannot change anything themselves.
Anyway, it is possible that neither Place record specifies the tidied form of the Place name.
The 'Tidy Place Name' removes redundant commas that may need retaining in the merged Place name.
Alternatively, the duplication may only be due to upper/lower case differences in the Place name.
So, what format should the merged Place Name adopt?
There may also be different content such as Lat/Long, Notes, Media, etc, in the two Place records.
Which of the two Place records and various content fields should be the preferred choice?
Thus it is not clear what any automated merging should do in such circumstances.
That is why the merging is delegated to the user to resolve.
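To make the ambiguity concrete, here is a minimal Python sketch of how names that differ only by redundant commas or letter case can be grouped as duplicates. It is an assumption-laden illustration, not the plugin's actual logic: `comparison_key` and `find_duplicate_groups` are invented names, and the key is used only for comparison, never as the merged name.

```python
from collections import defaultdict


def comparison_key(place_name):
    """Case-insensitive key with empty comma-separated parts removed.
    Used only to compare names, not as the name to store (assumption)."""
    parts = [part.strip().lower() for part in place_name.split(",")]
    return ", ".join(part for part in parts if part)


def find_duplicate_groups(place_names):
    """Return lists of names that share the same comparison key."""
    groups = defaultdict(list)
    for name in place_names:
        groups[comparison_key(name)].append(name)
    return [names for names in groups.values() if len(names) > 1]
```

Each group shows which records collide, but nothing in it says which spelling, or which record's other fields, should survive the merge.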
The 'Merge/Compare Records' option is on the Toolbar 'Edit' menu (next to the 'File' menu).
Mike Tate ~ researching the Tate and Scott family history ~ tatewise ancestry