table and debug
When I am debugging I can see the number of entries and the number of hashes:
_famOBJ => (table #16, .15)
I cannot figure out any way in my code to get the #hash.
Does anyone know how I can do that?
Thanks.
FH V.6.2.7 Win 10 64 bit
- tatewise
- Megastar
- Posts: 28342
- Joined: 25 May 2010 11:00
- Family Historian: V7
- Location: Torbay, Devon, UK
- Contact:
Re: table and debug
# is the length operator:
local intSize = #_famOBJ
-- intSize now contains 16
local strText = "ABC"
local intText = #strText
-- intText now contains 3
RTFM 2.5.5 – The Length Operator
Mike Tate ~ researching the Tate and Scott family history ~ tatewise ancestry
Re: table and debug
# is (after a fashion) the number of entries.
It is not the # of hash entries.
RTFOP 1.0
FH V.6.2.7 Win 10 64 bit
Re: table and debug
You originally posted:
_famOBJ => (table #16, .15)
I cannot figure out any way in my code to get the #hash.
I assumed you meant the #16 value you showed in (table #16, .15), and #_famOBJ does that.
Please fully explain what you mean by "the # of hash entries".
Do you mean the number after the dot, i.e. .15 in your example? That is impossible, as it must be larger than the # value.
That needs a function to traverse the table:
Code: Select all
local function hash(t)
    local max = 0
    for j, k in pairs(t) do
        max = max + 1
    end
    return max
end
local intHash = hash(_famOBJ)
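To illustrate on a small table, here is a hedged sketch (the table `tbl` and its contents are invented for the example, not from Ron's plugin; the counting function is the same pairs() loop as above, repeated so the snippet is self-contained):

```lua
-- Counting keys in a table with both array-style and named entries.
local function hash(t)
    local max = 0
    for j, k in pairs(t) do
        max = max + 1
    end
    return max
end

local tbl = { 10, 20, 30 }   -- three consecutive integer keys: #tbl == 3
tbl.alpha = 1                -- plus two named keys
tbl.beta = 2

print(#tbl, hash(tbl))       -- prints 3 and 5
```

The `#` result matches the first number FH's debugger shows, while the pairs() count includes every key of any type.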
Mike Tate ~ researching the Tate and Scott family history ~ tatewise ancestry
Re: table and debug
When I took that debug snapshot the table was still 'early',
but in any running table there is a 'key' part and a hash part.
Quite often you will see it as:
rva(#111111111, .0)
or you can see it as:
xlb(#99393, .78967)
(these are made-up numbers)
Way back, I believe it was you who told me the dot numbers were the hash part; doesn't matter though, someone did, and it makes sense.
I am investigating how I might minimize table extents, and other issues.
I am assuming (and for good reason) that the dot does not mean a fractional number, but an indicator of the number of hash entries. FH debug knows it, it seems, and I would like to know it in my program.
For a program containing a hog-slaughtering table (or several), it will take up huge time, memory and disk that is not directly solving the program problem.
I could prebuild the empty carcass of that table and save considerable computing, if I can know it. In the case of famOBJ, while I have saved the last-updated of every record in the file in a single famOBJ record, I am still unsure that your linked list (or even one I could create) would work to update only parts of the table, or some other method to do so, because if an individual changes, is removed, or is added in a FAM, I would have to hunt down and correct every entry that needs fixing, because it might occur FAM * (1/2 INDI) * (est) 8 times.
FAM is about 2800 at present, and INDI is about 7900. So, at present, saving the table and updating the data in it is far more complex than the GPS table and linked list you helped me build, and beyond my doing.
So it would seem to me that the best way I can do it is: load the last famOBJ, grab the #entries and #hashes, delete famOBJ, build a famOBJ carcass from those bits of information (there are tables in tables, some numerically indexed and some not), and spin code. I have it down to 10 minutes now, from more than 30 hours originally and 4 hours typically, and can only see reducing table extents as my last great hope in reducing that time.
FH V.6.2.7 Win 10 64 bit
Re: table and debug
Back to basics. See Lua 5.1 Reference Manual 2.5.7 – Table Constructors
All table entries are composed of a key and a value, e.g. tblList[key] = value
Each key can be an integer (i.e. 1, 2, 3, 4, 999, 75643) or a name (i.e. "Alpha", "beta", "James")
The # number is how many consecutive integer keys exist starting from 1.
The . number is the total number of keys (so it must be greater than or equal to the # number).
The only way to compute that . number is to run the function I posted earlier.
I guess the FH debug must perform a similar computation in the user interface.
You could use the Penlight library, which has a function size that gives the actual size of the table.
However, I suspect that under the hood it just uses the same pairs loop.
This question has been asked several times on https://stackoverflow.com/ with the same answer.
Mike Tate ~ researching the Tate and Scott family history ~ tatewise ancestry
Re: table and debug
and answered incorrectly in every case.
I don't care about the actual number of entries; the # value is good enough for the ancient Lua I am using.
I care about the separate number of entries in the hash portion.
That's why I used two different words, each for a different concept:
entries ~= hash entries
I will guess that nobody knows how, since the answers are non sequiturs. But thanks, that's OK; something else I can't do, or can estimate rightly or wrongly, so I guess I will try to find time elsewhere.
The final on the debug is:
_famOBJ(#3031, .2890)
So, knowing what is in there, they are zero-based numbers, where:
# == number of table-entry spaces allocated (3032)
. == number of actual table entries (2891, my number of FAM records)
So FH does not know the number of hash entries.
FH V.6.2.7 Win 10 64 bit
- ColeValleyGirl
- Megastar
- Posts: 5465
- Joined: 28 Dec 2005 22:02
- Family Historian: V7
- Location: Cirencester, Gloucestershire
- Contact:
Re: table and debug
Mike, it would be helpful if you pointed to a StackOverflow query, not just the entirety of StackOverflow!
Ron, https://stackoverflow.com/a/29930168/1943174 might provide food for thought.
Helen Wright
ColeValleyGirl's family history
Re: table and debug
Ron, where are you getting that _famOBJ(#3031, .2890) display? A screenshot might help.
As I have said before, the Fh lower right debug pane shows the Value such as (table #1234 .4321)
So I guess you are talking about some other debug feature where the # number is space allocated.
I still have no idea what hash entries are. Saying 'entries ~= hash entries' does not help me.
Can you refer to some documentation that explains what Lua table hash entries are?
Mike Tate ~ researching the Tate and Scott family history ~ tatewise ancestry
Re: table and debug
https://www.lua.org/gems/sample.pdf
This is where I got the discussion on the hash entries; it is a robust algorithm used universally.
Consider how Lua must handle the following:
for i = 1, 1000 do
    table.insert(fred, i)
end
or:
<start segment a>
fred[1] = {ID = fID, usig = usig}
fred[2] = {ID = HUSB.iID, usig = usig}
fred[3] = {ID = WIFE.iID, usig = usig}
n = 4
c = 1
while ptrCHIL:IsNotNull() do
    fred[n] = {ID = CHIL[c].iID, usig = usig}
    c = c + 1
    n = n + 1
    ptrCHIL:MoveNext('SAME_TAG')
end
</end segment a>
<start segment b>
z = n + 1
for i = z, 1000 do
    k = 'miketate' .. i
    fred[k] = {ID = 0, usig = 0}
end
</end segment b>
Lua table functions behind the scenes are hashed.
In segment a it must make a decision: do I need to hash this value to determine a slot in the table?
Nope: the first entry's [_KEY_] is 1.
Repeat for the next entries in segment a.
Then segment b starts on the same table.
Decision: do I need to hash...?
Yup, create one (although initially I believe the hash area is 16 entries, based on what I have read over weeks of this quest).
Do I need more hash space? At some point (at 17) the answer will be yes.
Recreate a table space based on the next power of 2 if necessary, and recreate the hash part of the table at a power of 2 (32 now).
Copy the old table into the new table space... throw away the pointer to the old table and update the pointer to this table (probably via a pointer dictionary), since I do not see the table address change in _G or in my program...
and so on.
_famOBJ(#3031, .2890)
^ This is coming from the debug screen when I have finished entering all the entries in my table, just before I save it.
The save (from chillcode) I have not debugged exhaustively, but I have seen it in debug if I hit the break just right (in my code it is a require, so not all variables are visible all the time) -- the tables and lookup are over 2.5 million entries each. And looking at the output table _famOBJ.dat saved to disk, it indeed is a lot of tables.
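As an aside on the rehashing walkthrough above: standard Lua exposes no API for reading the hash-part size directly, but its cost can be observed indirectly through collectgarbage. A hedged sketch (key names and the entry count are invented for illustration):

```lua
-- Standard Lua has no way to read a table's hash-part size, but its
-- memory cost can be observed with collectgarbage("count").
local function kbytes()
    collectgarbage("collect")
    return collectgarbage("count")   -- total Lua memory in KB
end

local before = kbytes()
local t = {}
for i = 1, 50000 do
    t["key" .. i] = i                -- string keys all live in the hash part
end
local after = kbytes()
print(string.format("50000 hash entries cost roughly %.0f KB", after - before))
```

Comparing runs with string keys against runs with consecutive integer keys gives a rough feel for how much the hash part is costing, even though the exact slot count stays hidden.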
FH V.6.2.7 Win 10 64 bit
Re: table and debug
I have no experience of those hash entries so cannot help you. Sorry!
Mike Tate ~ researching the Tate and Scott family history ~ tatewise ancestry
Re: table and debug
From the chillcode save and load I have two questions that I am wondering about, and I hope someone can explain.
In the export string,
I understand it 'Lua normalizes' the string with %q,
then it removes excess '\' from line feeds and carriage returns.
str = string.gsub(str, string.char(26), '\' .. string.char(26) .. \'')
I do not understand this line, or the why of it; I believe (26) to be a substitute character,
such that 'invalidchar' will become '\someinvalidchar\', and I have no idea what the final '' does.
Is that true? Why does it need this special handling in a table? Nil it or void it, maybe? And doesn't FH put a '?' there anyhow?
Next, when it saves the table to a file, if it finds any tables in tables, it builds reindexed subtables. For example:
I have 2890 _famOBJ in one table, each table entry keyed on the numeric fID (fhGetRecordId(ptrFAM)); some numbers are missing sequentially. We will call it a table.
_famOBJ [fID] =
{
EPI = array of [1] to [n] of {iID, usig,}
HDR = table of 1 entry {_fix, fPTR, fID, FAM, MDAT, mdat, DDAT, ddat, fanc{-- ancestors: father, mother ... g(n)gf, g(n)gm}},
HUSB = table of 1 entry **repeating structure
WIFE = table of 1 entry **repeating structure
CHIL = array of [1] to [n] of **repeating structure, {fPTR, fID, fpa, fby,}
}
**repeating structure = -- INDI values ++ arrays>>
{
ORD, PTG, iID, fID, fPTR, NAME, iPTR, SEX, ERA, BIRTH, DEATH,
famc = array of [1] to [n] of **repeating structure, {fPTR, fID, fpa, fby,}, -- fam as child and adop or step
fams = array of [1] to [n] of {fID, fPTR, mdptr, fmdat, MDAT, mdat, ddptr, dmdat, DDAT, ddat,}, -- fam as spouse
fanc = array of [1] to [n] of **repeating structure, -- ancestors: father, mother ... g(n)gf, g(n)gm
fdsc = array of [1] to [n] of **repeating structure, -- descendants: child, ... g(n)gc
}
The save function creates 2103094 (that is correct: 2 million, 103 thousand and 94) redirected table chunks of entries, which takes a great deal of time, is complex to follow on the disk, and puts me at the very edge of running out of memory without me adding certain pieces of information I need to. To debug puts me out of memory.
So, on the question for this: can anyone give me an example of a table structure that would need this sort of slicing and dicing to save? Why SHOULDN'T it be simplified to nearly straight in and out?
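For comparison, a "nearly straight in and out" save is possible when the table holds only numbers, strings, booleans and nested tables (no functions, no shared subtables, no cycles). A hedged sketch under those assumptions; serialize and save are invented names, not chillcode's:

```lua
-- Minimal recursive serializer: writes the whole table as one Lua
-- expression, so a single loadstring() call rebuilds it in one pass.
-- Assumes no functions, no cycles and no userdata anywhere inside.
local loadstring = loadstring or load   -- Lua 5.1 / 5.2+ compatibility

local function serialize(v, out)
    local t = type(v)
    if t == "table" then
        out[#out + 1] = "{"
        for k, val in pairs(v) do
            out[#out + 1] = "["
            serialize(k, out)
            out[#out + 1] = "]="
            serialize(val, out)
            out[#out + 1] = ","
        end
        out[#out + 1] = "}"
    elseif t == "string" then
        out[#out + 1] = string.format("%q", v)   -- quotes and escapes
    else
        out[#out + 1] = tostring(v)              -- numbers, booleans
    end
end

local function save(tbl)
    local out = {}
    serialize(tbl, out)
    return "return " .. table.concat(out)
end

-- round trip:
local chunk = save{ 1, 2, name = "Fred", sub = { x = true } }
local copy = loadstring(chunk)()
print(copy.name, copy.sub.x, copy[2])   -- Fred  true  2
```

The "slicing and dicing" in tableloadsave exists largely to handle shared and self-referencing subtables, which this straight-through version would duplicate or loop on forever.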
FH V.6.2.7 Win 10 64 bit
Re: table and debug
In a literal string such as 'this is a string', if you want the ' character it must be escaped as \'
e.g. 'this is \' a string'
But it could have been written as "this is ' a string" using double-quotes instead.
So '\'..string.char(26)..\'' could be written as "'..string.char(26)..'"
i.e. the resultant text string is literally '..string.char(26)..'
So the one-byte ASCII code 26 SUB character becomes the 21-character string '..string.char(26)..'
When that 21-character string is 'compiled' by the loadfile() function, it turns back into the one-byte code 26 SUB character.
That also explains the { } table constructor nesting: the loadfile() function 'compiles' it into a table.
Review the loadfile() function in the Lua Reference Manual.
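That explanation can be checked in a few lines. A sketch that round-trips a string containing the SUB byte through the same gsub and back via loadstring (the sample string "ab...cd" is made up; Lua 5.1's loadstring is assumed, with a fallback to load for newer Lua):

```lua
-- The SUB byte (ASCII 26) is replaced by the literal 21-character text
-- '..string.char(26)..'  so that, when the saved quoted string is
-- compiled, the concatenation rebuilds the original byte.
local loadstring = loadstring or load   -- Lua 5.1 / 5.2+ compatibility

local SUB = string.char(26)
local s = "ab" .. SUB .. "cd"
local escaped = string.gsub(s, SUB, '\'..string.char(26)..\'')
print(escaped)   -- ab'..string.char(26)..'cd

-- embed it in single quotes, as the save function does, and compile:
local chunk = loadstring("return '" .. escaped .. "'")
assert(chunk() == s)   -- the one-byte SUB character is back
```

The compiled chunk evaluates 'ab'..string.char(26)..'cd', which is exactly the original three-part string.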
Mike Tate ~ researching the Tate and Scott family history ~ tatewise ancestry
Re: table and debug
https://www.lua.org/pil/8.html
Where are you getting in-depth meaningful information? Cuz that is as enlightening as winking at a girl in a dark closet: you know what you want...
So, by reading that (and wasting actual moments of my life, more than 50 times since I started trying to figure out how to do this), and by what is not said there, the reason it is so convoluted and wasteful is that there may be a function in the table, because it doesn't use loadfile nor loadstring to bring it in, just a for k, v loop over the table.
So if there are no functions in the table, it could be saved straightforwardly and loaded straightforwardly.
The substitute thing: thanks, I get what you are saying it does, and I will try to work out how that statement does the replacement.
FH V.6.2.7 Win 10 64 bit
Re: table and debug
I use the tableloadsave v0.94 Lua 5.1 compatible variant, whose Load Function has both loadstring() and loadfile():
The tableloadsave v1.0 Lua 5.2 compatible variant's Load Function uses only loadfile():
See http://lua-users.org/wiki/SaveTableToFile which among other things says:
Code: Select all
--// The Load Function
function table.load( sfile )
    local tables, err
    -- catch marker for stringtable
    if string.sub( sfile, -3, -1 ) == "--|" then
        tables, err = loadstring( sfile )
    else
        tables, err = loadfile( sfile )
    end
Code: Select all
--// The Load Function
function table.load( sfile )
    local ftables, err = loadfile( sfile )
If you don't like the way that works then write your own table save and load functions.
NOTICE: NOT suitable for huge tables. Saving a table containing 100,000 subtables with 25 fields (1 short text line and 24 numerical entries) works. Somewhere above that, performance is disastrous when saving, and you get a "constant table overflow" error when you try to read it back. With smaller tables it seems to work, especially considering it is a cut-and-paste solution to your need. There is some test code that shows that you can very well save such big tables; of course this is also a machine issue.
Mike Tate ~ researching the Tate and Scott family history ~ tatewise ancestry
Re: table and debug
So, it is as I expected. I have read all that; it is what I use. I was trying to find out why it was necessary to cleave the tables and arrays into parts.
I finally have a table where I have to write my own load and save, cuz it is a big one; even EditPad strains under it.
Thanks, it helps considerably.
FH V.6.2.7 Win 10 64 bit