Introduction
Some Plugins perform intensive repetitive operations which, on a large database of more than 10,000 Individuals, may take a long time or need large amounts of memory.
This article suggests how these resources can be minimised by using a few simple techniques. If the Plugin run time is measured in minutes rather than seconds, then even a 10% saving becomes significant.
Global v Local
As a general rule local variables are more efficient than global variables, but there are exceptions.
When a function requires a lookup table of constants, such as below, then it is faster if it is global.
TblLookup = { A=1,E=2,I=3,O=4,U=5 }
This is because the global table is only created once, whereas a local table is created every time the function is called.
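As a rough illustration, the sketch below contrasts the two approaches in plain Lua. The function names VowelCodeGlobal, VowelCodeLocal and TimeCalls, and the loop count, are hypothetical and chosen only for this example. Timings vary between machines and Lua versions, so treat the printed figures as indicative only.

```lua
-- Lookup table created once, when the script is loaded
TblLookup = { A=1, E=2, I=3, O=4, U=5 }

-- Uses the shared table, so no table is built during the call
function VowelCodeGlobal(strChar)
    return TblLookup[strChar] or 0
end

-- Builds an identical table on every call, then discards it
function VowelCodeLocal(strChar)
    local tblLookup = { A=1, E=2, I=3, O=4, U=5 }
    return tblLookup[strChar] or 0
end

-- Hypothetical timing helper: CPU seconds taken by intCount calls
function TimeCalls(fncCode, intCount)
    local numStart = os.clock()
    for i = 1, intCount do
        fncCode("E")
    end
    return os.clock() - numStart
end

print("Global table:", TimeCalls(VowelCodeGlobal, 200000))
print("Local table: ", TimeCalls(VowelCodeLocal, 200000))
```

Both functions return the same values; only the cost of building the table differs.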
It can also help to define local variables that reference global variables, especially an indexed table entry. e.g.
local tblCode = TblLookup
local intNumb = IntNumber
local tblMode = TblMode[intNumb]
Where the same table lookup or other complex operation is required multiple times, then assign the result to a local variable and use it multiple times instead. e.g.
if tblCode[strA] > 1 and tblCode[strA] < intLast then
    intSum = intSum + tblCode[strA]
end
becomes
local intCode = tblCode[strA]    -- Look up the value once
if intCode > 1 and intCode < intLast then
    intSum = intSum + intCode    -- Use the local variable three times
end
Some of the above techniques are illustrated in the Soundex (code snippet) examples, where the Global Variable Version runs about 5 times faster than the Local Variable Version, and the Function Prototype Version takes things one step further.
Although a local function within another function clarifies its scope of use, a global function is faster. This is probably because the global function is defined only once, whereas a local function is defined every time its containing function is called.
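The difference between a nested and a global helper function can be sketched as follows. The names SumSquaresNested, SumSquaresGlobal and Square are hypothetical, and the sketch only shows the two layouts; it does not measure the speed difference.

```lua
-- Nested layout: the 'local function' statement is executed
-- every time SumSquaresNested is called
function SumSquaresNested(tblNumb)
    local function Square(intNumb)
        return intNumb * intNumb
    end
    local intSum = 0
    for _, intNumb in ipairs(tblNumb) do
        intSum = intSum + Square(intNumb)
    end
    return intSum
end

-- Global layout: the helper is defined once, when the script is loaded
function Square(intNumb)
    return intNumb * intNumb
end

function SumSquaresGlobal(tblNumb)
    local intSum = 0
    for _, intNumb in ipairs(tblNumb) do
        intSum = intSum + Square(intNumb)
    end
    return intSum
end

print(SumSquaresNested({1,2,3}), SumSquaresGlobal({1,2,3}))
```

Both layouts give the same result, so the choice is a trade-off between clearer scope and run time.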
Progress Bar
The Progress Bar (code snippet) provides useful feedback and a cancel option for long running Plugins. However, if the Global Variable Version is called too frequently, its own code can significantly extend the run time.
Therefore, it may be better to avoid calling the ProgressBar.Step function on every loop step. e.g.
ProgressBar.Start("Loop Progress",9000)
intSteps = 0
for i = 1, 9000 do
    intSteps = intSteps + 1
    if intSteps == 100 then      -- Note that if intSteps % 100 == 0 then is slower
        intSteps = 0
        ProgressBar.Step(100)    -- Only update Progress Bar every 100 steps
    end
    if ProgressBar.Stop() then break end
end
This problem has been mitigated in the Function Prototype Version by only updating the display when necessary instead of every Step.
Do not make the ProgressBar.Stop() function conditional on the intSteps count, otherwise it may make interrupting the loop and other interactions less responsive.
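To see how much the batching reduces the number of Progress Bar updates, the sketch below replaces the real Progress Bar (code snippet) with a hypothetical stand-in that only counts calls. The stand-in table, the intTotal value and the final flush of any remaining steps are assumptions made for this example, not part of the snippet itself.

```lua
-- Hypothetical stand-in for the Progress Bar snippet: it only counts calls
ProgressBar = { Calls = 0, Steps = 0 }
function ProgressBar.Start(strTitle, intMax)
    ProgressBar.Max = intMax
end
function ProgressBar.Step(intStep)
    ProgressBar.Calls = ProgressBar.Calls + 1
    ProgressBar.Steps = ProgressBar.Steps + intStep
end
function ProgressBar.Stop()
    return false                     -- The user never cancels in this sketch
end

intTotal = 9050                      -- Deliberately not a multiple of 100
ProgressBar.Start("Loop Progress", intTotal)
intSteps = 0
for i = 1, intTotal do
    intSteps = intSteps + 1
    if intSteps == 100 then
        intSteps = 0
        ProgressBar.Step(100)        -- Update once per 100 loop steps
    end
    if ProgressBar.Stop() then break end   -- Still checked on every loop step
end
if intSteps > 0 then
    ProgressBar.Step(intSteps)       -- Flush any steps left over after the loop
end

print(ProgressBar.Calls, ProgressBar.Steps)
```

When the loop count is not an exact multiple of the batch size, the final flush keeps the bar in step with the true total.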
Large Files
Sometimes it is necessary to process the contents of large files line by line. For this it is much faster to use table.insert and table.concat than string concatenation such as strText = strText..strLine. e.g.
local tblText = {}
for strLine in io.lines(strFile) do         -- Read through the file line by line
    strLine = strLine:gsub("abc","xyz")
    table.insert(tblText,strLine)           -- Insert the line of text as the next table entry
end
local strText = table.concat(tblText,"\n")  -- Concatenate the lines of text separated by newline
SaveStringToFile(strText,strFile)           -- See the Save String To File (code snippet)
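The two ways of joining lines can be compared with a small sketch such as the one below. It builds its sample lines in memory rather than reading a file, and the names JoinByConcat and JoinByTable and the line count are hypothetical. The timings depend on the machine, so no particular figures are claimed.

```lua
-- Hypothetical sample data standing in for the lines of a large file
tblLines = {}
for i = 1, 5000 do
    tblLines[i] = "line " .. i
end

-- Slow approach: every .. creates a new, longer string
function JoinByConcat(tblLines)
    local strText = ""
    for i, strLine in ipairs(tblLines) do
        if i > 1 then strText = strText .. "\n" end
        strText = strText .. strLine
    end
    return strText
end

-- Fast approach: table.concat joins all the entries in one operation
function JoinByTable(tblLines)
    return table.concat(tblLines, "\n")
end

numStart = os.clock()
strSlow = JoinByConcat(tblLines)
numSlow = os.clock() - numStart

numStart = os.clock()
strFast = JoinByTable(tblLines)
numFast = os.clock() - numStart

print(strSlow == strFast, numSlow, numFast)
```

The gap between the two timings tends to widen as the number of lines grows, because each concatenation copies the whole string built so far.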
Large Tables
Very large tables of data can arise, say when keeping results of each Individual Record in the database compared with every other Individual Record. For smaller databases up to 10,000 Individuals, this amounts to less than 50,000,000 entries, but quickly escalates for larger databases, and can exhaust available memory.
To avoid this problem the table of results should be sorted and the lowest entries pruned off. e.g.
if intScore >= intMinimum then      -- Continue if score is above lowest retained Results entry
    table.insert(tblResults,{ Score=intScore, ... })
    if #tblResults >= 2000 then     -- Prune low scores from Results to avoid exhausting memory
        table.sort( tblResults, function(tblA,tblB) return tblA["Score"] > tblB["Score"] end )
        for i = 1 , #tblResults / 2 do
            table.remove(tblResults)    -- Remove the lower 50% of the sorted Results
        end
        intMinimum = tblResults[#tblResults]["Score"]
    end
end
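The pruning logic can be tried outside Family Historian with a sketch like the one below. The AddResult wrapper, the limit of 20 entries (instead of 2000) and the synthetic scores are all assumptions made to keep the example small. The scores are a shuffled sequence of 0 to 999 with no duplicates.

```lua
intMinimum = 0       -- Lowest score currently worth keeping
tblResults = {}

-- Hypothetical wrapper around the pruning logic
function AddResult(intScore, intLimit)
    if intScore >= intMinimum then
        table.insert(tblResults, { Score = intScore })
        if #tblResults >= intLimit then
            -- Sort highest score first, then drop the lower half
            table.sort(tblResults, function(tblA, tblB) return tblA.Score > tblB.Score end)
            for i = 1, #tblResults / 2 do
                table.remove(tblResults)
            end
            intMinimum = tblResults[#tblResults].Score
        end
    end
end

-- Synthetic scores: (i * 37) % 1000 visits each of 0..999 exactly once
for i = 1, 1000 do
    AddResult((i * 37) % 1000, 20)
end

-- Final sort so the retained results can be listed in order
table.sort(tblResults, function(tblA, tblB) return tblA.Score > tblB.Score end)
print(#tblResults, tblResults[1].Score, intMinimum)
```

The table never grows beyond the limit, yet the highest scores survive every prune, because each prune discards only the lower half of the sorted entries.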
Data Tables
When testing for alternative data values it is tempting to use if … then … else … structures, but when there are more than a few values it can become inefficient. Consider the following where each data reference tag is tested several times:
ptrIndi = fhNewItemPtr()
ptrIndi:MoveToFirstRecord("INDI")
while ptrIndi:IsNotNull() do
    local ptrData = fhNewItemPtr()
    ptrData:MoveToFirstChildItem(ptrIndi)
    while ptrData:IsNotNull() do
        local strTag = fhGetTag(ptrData)
        if strTag == "NAME" then
            -- Handle names
        elseif strTag == "FAMS" then
            -- Handle spouse
        elseif strTag == "SOUR" then
            -- Handle source
        end
        ptrData:MoveNext()
    end
    ptrIndi:MoveNext()
end
The following data table method is more efficient, as each data reference is only tested once. It becomes even more efficient as the number of alternatives increases. The only condition is that all the functions must support the same parameters.
function HandleNames(ptrData)
    -- Handle names here
end

function HandleSpouse(ptrData)
    -- Handle spouse here
end

function HandleSource(ptrData)
    -- Handle source here
end

function Null()
    -- Handle anything else
end

tblWhat = {    -- Translate data tag to function
    NAME = HandleNames;
    FAMS = HandleSpouse;
    SOUR = HandleSource;
}

ptrIndi = fhNewItemPtr()
ptrIndi:MoveToFirstRecord("INDI")
while ptrIndi:IsNotNull() do
    local ptrData = fhNewItemPtr()
    ptrData:MoveToFirstChildItem(ptrIndi)
    while ptrData:IsNotNull() do
        local strTag = fhGetTag(ptrData)
        local action = tblWhat[strTag] or Null
        action(ptrData)    -- Call one of the functions above
        ptrData:MoveNext()
    end
    ptrIndi:MoveNext()
end
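The same dispatch idea can be tested without a Family Historian database by feeding it plain tag strings. In the sketch below the tblTags list and the tblCount counters are hypothetical stand-ins for real child items and real handling code.

```lua
tblCount = { NAME = 0, FAMS = 0, SOUR = 0, Other = 0 }

-- Every handler accepts the same single parameter
function HandleNames(strTag)  tblCount.NAME  = tblCount.NAME  + 1 end
function HandleSpouse(strTag) tblCount.FAMS  = tblCount.FAMS  + 1 end
function HandleSource(strTag) tblCount.SOUR  = tblCount.SOUR  + 1 end
function Null(strTag)         tblCount.Other = tblCount.Other + 1 end

tblWhat = {    -- Translate data tag to function
    NAME = HandleNames;
    FAMS = HandleSpouse;
    SOUR = HandleSource;
}

-- Hypothetical tags standing in for the child items of one record
tblTags = { "NAME", "SEX", "BIRT", "FAMS", "SOUR", "SOUR" }

for _, strTag in ipairs(tblTags) do
    local action = tblWhat[strTag] or Null    -- One table lookup per tag
    action(strTag)
end

print(tblCount.NAME, tblCount.FAMS, tblCount.SOUR, tblCount.Other)
```

Adding another tag only needs one more handler function and one more table entry; the loop itself stays unchanged.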