Find Duplicate Individuals ~ Frequently Asked Questions

Why are so many candidate duplicate pairs listed that are not duplicates?

The assessment algorithm is not perfect, so it is inevitable that some false matches will get listed. That is why the Omit Non-Duplicates Tab is provided.

First try adjusting the Set Preferences Tab ~ User Interface Tab settings, by raising the Individual Threshold or Results Minimum Score. For more precise control try adjusting settings on the other tabs.

If most false matches have different Surnames, then try setting Last Wrong on the Set Preferences Tab ~ Names Matching Tab to a negative value for Individuals.

If many false matches are caused by the use of dummy forenames such as Not Known or Not Found, then replace them with abbreviations such as N.K. or NF, or with symbols such as ? or ~, as these are all ignored.

Why are some deliberate duplicates that I entered not listed?

Certain pairs of closely related Individuals are excluded from the list, including siblings and parents/children. See the Family assessment for details.

If you deliberately add duplicates to the family tree to test the Plugin, they must be very distant relatives or in a different Pool.

The Set Preferences Tab ~ Family & Gender Tab explains how to include closely related Individuals if really needed.

What is the Soundex Codes cache and when should it be erased?

All Names of Individuals and Places are assigned a Soundex Code that allows similar sounding Names to be matched.

To minimise the Soundex Algorithm overhead, the Names and their Codes are stored in a lookup cache file.

Over time this cache can accumulate Names that no longer exist in the database, so it may benefit from being erased using the Set Preferences Tab ~ User Interface Tab. It is automatically repopulated the next time the Plugin is used.
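
As an illustration of the idea only, here is a minimal Python sketch of a name-to-code lookup cache. It assumes the standard American Soundex rules; the Plugin's actual Soundex variant and cache file format are not described here, and the names soundex and SoundexCache are hypothetical. Saving the cache to a file is omitted.

```python
# Assumption: standard American Soundex, not necessarily the Plugin's variant.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name: str) -> str:
    """Return the 4-character Soundex code for a name ('' if it has no letters)."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        if digit:
            if digit != prev:      # adjacent letters with the same code count once
                code += digit
            prev = digit
        elif ch not in "HW":       # vowels separate repeated codes; H and W do not
            prev = ""
        if len(code) == 4:
            break
    return code.ljust(4, "0")


class SoundexCache:
    """Hypothetical in-memory cache mapping each Name to its Soundex Code."""

    def __init__(self):
        self.entries = {}

    def lookup(self, name: str) -> str:
        if name not in self.entries:       # compute once, then reuse
            self.entries[name] = soundex(name)
        return self.entries[name]

    def erase(self) -> None:
        """Discard all entries; they are recomputed on the next lookup."""
        self.entries.clear()
```

In this sketch, erasing the cache only drops stale Names; any Name still in use is simply recomputed and stored again on its next lookup, which mirrors the automatic repopulation described above.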

How do I avoid the long run-time on my large GEDCOM database?

If a GEDCOM database contains thousands of Individual Records, the Plugin can take many minutes to complete. The following strategy avoids repeating that long run-time.

  1. Run the Plugin once on the whole database using the Find any Duplicates… button.
  2. Work through the Result Set, merge any duplicate pairs using Edit > Merge/Compare Records, and note any non-duplicate pairs.
  3. Then open the Plugin again and use the Omit Non-Duplicates Tab as necessary.
  4. Now the Show previous Result Set… button will omit Merged pairs and Non-Duplicates from the Result Set.
  5. Steps 2 to 4 can be repeated without incurring the time penalty of the Find any Duplicates… button.

When complete, use the Tick to include only Individuals updated after Plugin last run Date option. Then only updated Individuals will be checked in future, reducing run-time considerably.
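
The reason this strategy saves time can be sketched in Python: re-showing a stored Result Set is just a filter over pairs already found, whereas a full search must compare Individuals again. This is a rough model under stated assumptions, not the Plugin's implementation; pair_key, show_previous_result_set and individuals_updated_after are hypothetical names, and the data shapes are invented for illustration.

```python
from datetime import date


def pair_key(a, b):
    """Order-independent key for a pair of record ids (hypothetical)."""
    return frozenset((a, b))


def show_previous_result_set(result_set, merged_pairs, non_duplicates):
    """Filter a stored list of (id_a, id_b, score) tuples without re-running the search."""
    omitted = merged_pairs | non_duplicates
    return [(a, b, score) for a, b, score in result_set
            if pair_key(a, b) not in omitted]


def individuals_updated_after(updated_dates, last_run):
    """Return the ids whose last update is later than the previous run date."""
    return sorted(rec for rec, updated in updated_dates.items()
                  if updated > last_run)


# Example with made-up data: one pair merged, one pair marked as non-duplicate.
results = [(1, 2, 90), (3, 4, 80), (5, 6, 70)]
remaining = show_previous_result_set(
    results,
    merged_pairs={pair_key(1, 2)},
    non_duplicates={pair_key(4, 3)},
)
```

Here remaining holds only the pair (5, 6), and individuals_updated_after models how restricting future runs to recently updated Individuals shrinks the work to be done.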

Can an accidentally erased Omit Non-Duplicates list be recovered?

The Omit Non-Duplicates list is saved in the …\Family Historian Projects\{Project}\{Project}.fh_data\Plugin Data\ folder, and its filename is Find Duplicate Individuals.nondups.

If this list is accidentally erased, then on Windows 7 it can probably be easily retrieved.

Ensure that the file exists, by running the Plugin if necessary, then right-click on the file and choose Restore previous versions.

When the search completes, choose from the list of File versions presented, and click OK.

On other Windows versions the file will have to be retrieved from Project or PC backups.

What causes the "The Internet appears to be inaccessible" message?

This Plugin checks the Plugin Store for a later version via the Internet. If there is no connection, this message is produced.

While there is no connection, the message is suppressed in all Plugins that have this feature. It reappears only after none of them has been run for 10 hours, or after one of them detects that the connection has been restored and it is later lost again.

Also the online Help & Advice pages will be inaccessible.
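
One possible reading of the suppression rule is sketched below in Python. It is an interpretation of the behaviour described above, not the Plugins' actual code; ConnectionNotice and on_plugin_run are hypothetical names, and a single shared object stands in for whatever state the Plugins share.

```python
from datetime import datetime, timedelta

INHIBIT_WINDOW = timedelta(hours=10)  # the 10 hours stated above


class ConnectionNotice:
    """Hypothetical shared state for all Plugins that check the Plugin Store."""

    def __init__(self):
        self.inhibited = False
        self.last_run = None

    def on_plugin_run(self, now: datetime, connected: bool) -> bool:
        """Record a Plugin run and return True if the message should be shown."""
        if connected:
            # Connection restored: clear the suppression.
            self.inhibited = False
            self.last_run = now
            return False
        idle_too_long = (self.last_run is None
                         or now - self.last_run >= INHIBIT_WINDOW)
        show = (not self.inhibited) or idle_too_long
        self.inhibited = True
        self.last_run = now
        return show
```

In this model the first offline run shows the message, further offline runs within 10 hours of the previous run stay silent, and the message returns after a 10 hour gap or after a restored connection is lost again.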

CC Attribution-Noncommercial-Share Alike 4.0 International