pattern matching again

Posted: 18 Sep 2022 15:57
by Ron Melby
I have the following table entries:

husband's great (09x) grandaunt
great (11x) grandaunt
½ great (04x) granduncle

I want to change them to:
husband's (09x) great-grandaunt
(11x) great-grandaunt
½ (04x) great-granduncle

husband's great grandaunt >> I would expect this entry to be ignored, or rpl to be nil, in the following code.


swrlt = husband's great (09x) grandaunt
local pre, grt, rpl, pst = string.match(swrlt, '(.*) (great) %(%d+x%) (.*)')
pre = husband's
grt = great
rpl = (09x)
pst = grandaunt

great!!

swrlt = great (11x) grandaunt
This entry does not work, and the ? item has never worked for me as an optional match. All return values are nil. Am I doing something wrong?

manual says . is 0 or more and * is 0 or more
I would expect the capture to be
pre = nil
grt = great
rpl = (11x)
pst = grandaunt

Re: pattern matching again

Posted: 18 Sep 2022 17:09
by tatewise
I won't go into great details.

Your match pattern has a mandatory space character before (great)
So if the string being matched has no such space character before great then it won't match.
i.e.
"husband's great (09x) grandaunt" does have that space and matches.
"great (11x) grandaunt" does NOT have that space so does NOT match.
So the space before great needs to be optional, not mandatory.

BTW: When I test your pattern match, rpl and pst do not hold what you suggest, so have you posted the correct pattern?
I think ( parentheses ) around %(%d+x%) are missing?
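
To make both points concrete, here is a minimal sketch in plain Lua. The `posted` pattern is the one from the first post; the `fixed` pattern is only one possible repair (a lazy prefix with no mandatory space, plus a capture around the count), and the variable names are purely illustrative.

```lua
-- Pattern as posted: mandatory space before "great", no capture around the count.
local posted = '(.*) (great) %(%d+x%) (.*)'
-- One possible fix (an assumption, not the only option): lazy prefix without
-- the mandatory space, and parentheses around the "(NNx)" part.
local fixed = '(.-)(great) (%(%d+x%)) (.*)'

-- Only three captures exist, so the fourth variable stays nil.
local a, b, c, d = string.match("husband's great (09x) grandaunt", posted)
-- a = "husband's", b = "great", c = "grandaunt", d = nil

-- No space precedes "great", so the whole match fails and returns nil.
local none = string.match('great (11x) grandaunt', posted)

-- With the fixed pattern the prefix is an empty string, not nil.
local pre, grt, rpl, pst = string.match('great (11x) grandaunt', fixed)
-- pre = "", grt = "great", rpl = "(11x)", pst = "grandaunt"

-- The prefix keeps its trailing space when there is one.
local pre2, grt2, rpl2, pst2 = string.match("husband's great (09x) grandaunt", fixed)
-- pre2 = "husband's ", grt2 = "great", rpl2 = "(09x)", pst2 = "grandaunt"
```

Note that a successful capture of nothing yields an empty string rather than nil, so a test such as `pre or ''` is not needed to guard against a missing prefix.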

FYI:
The dot (.) pattern matches any one character.
The star (*) pattern means match 0 or more of the preceding item.
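
A few one-liners may help to see those items in action. This is only a sketch with made-up test strings; it also touches on the ? item mentioned in the question, which in Lua patterns applies to a single character or character class and not to a parenthesised group.

```lua
-- "." matches exactly one character (any character).
local dot_hit  = string.match('abc', '^a.c$')   -- "abc"
local dot_miss = string.match('ac', '^a.c$')    -- nil: "." needs a character to consume

-- "*" repeats the preceding item zero or more times.
local star_zero = string.match('ac', '^ab*c$')     -- "ac"
local star_many = string.match('abbbc', '^ab*c$')  -- "abbbc"

-- ".*" is greedy (longest match); ".-" is lazy (shortest match).
local greedy = string.match('great great aunt', '^(.*)great')  -- "great "
local lazy   = string.match('great great aunt', '^(.-)great')  -- ""

-- "?" makes the single preceding character (or class) optional.
local with_space    = string.match(' great', '^ ?great$')  -- " great"
local without_space = string.match('great', '^ ?great$')   -- "great"
```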

Re: pattern matching again

Posted: 18 Sep 2022 17:52
by Ron Melby

function cvtRLT()

  for fhrlt, swrlt in pairs(lineage) do

    -- fhrlt = family historian relationship fhCallBuiltInFunction('Relationship', rptr, iptr, 'TEXT', ix)
    -- swrlt = software relationship

    swrlt = fhrlt
    swrlt = swrlt:gsub('once', '1 times'):gsub('twice', '2 times'):gsub('%-removed', ' removed')
    -- align leading numerics and (xd) padded to (0dx)
    swrlt = swrlt:gsub('^(%d[^%d])(.*)', ' %1%2')
    swrlt = swrlt:gsub('%(x(%d)%)', '(0%1x)'):gsub('%(x(%d+)%)','(%1x)')
    swrlt = swrlt:gsub('great%-', 'grand'):gsub('half', '½')

    -- pre = pre replacement area
    -- rpl = replacement area
    -- grt = great
    -- pst = post replacement area
    local pre
    local rpl
    local grt
    local pst

    pre, rpl, pst = string.match(swrlt, '(.*[^%d])(%d+ times)(.*)')
    if rpl then swrlt = ('%s(%02ix)%s'):format(pre, tonumber(string.match(rpl, '(%d+)')), pst) end
    pre, grt, rpl, pst = string.match(swrlt, '(.-)(great) (%(%d%dx%)) (.*)')
    if grt and rpl then swrlt = ('%s%s %s-%s'):format(pre or '', rpl, grt, pst) end
    lineage[fhrlt] = swrlt:gsub('great ', 'great-')
  end
end
This code works now. Thanks. It took a few minutes and going to documentation that is not from PUC-Rio.

Re: pattern matching again

Posted: 18 Sep 2022 19:05
by tatewise
See FHUG Knowledge Base Understanding Lua Patterns.

If wanting maximum efficiency, your three lines of code:
pre, grt, rpl, pst = string.match(swrlt, '(.-)(great) (%(%d%dx%)) (.*)')
if grt and rpl then swrlt = ('%s%s %s-%s'):format(pre or '', rpl, grt, pst) end
lineage[fhrlt] = swrlt:gsub('great ', 'great-')
can be replaced with one line that avoids temporary variables and the format function:
lineage[fhrlt] = swrlt:gsub( "(.*)great (%(%d%dx%)) (.*)", "%1%2 great %3" ):gsub( "great ", "great-" )
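
As a quick check, the one-liner can be wrapped in a small helper and run against the three sample entries from the first post. The helper name `restyle` is hypothetical and exists only for this sketch; the outer parentheses drop the match count that gsub returns as a second value.

```lua
-- Hypothetical helper wrapping the one-line replacement for testing.
local function restyle(s)
  -- First gsub moves "(NNx)" in front of "great";
  -- second gsub turns every "great " into "great-".
  return (s:gsub('(.*)great (%(%d%dx%)) (.*)', '%1%2 great %3'):gsub('great ', 'great-'))
end

local r1 = restyle("husband's great (09x) grandaunt")  -- "husband's (09x) great-grandaunt"
local r2 = restyle('great (11x) grandaunt')            -- "(11x) great-grandaunt"
local r3 = restyle('½ great (04x) granduncle')         -- "½ (04x) great-granduncle"
```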

Re: pattern matching again

Posted: 18 Sep 2022 20:08
by Ron Melby
thank you, I am all for the efficiency.

my new '_STD_MAT':matRLT require:

-- materialize Relation
function matRLT(iptr, rptr, ix, __rtv)
  ix = ix or 1

  _ptyp  = type(iptr)
  if _ptyp  == 'userdata' then
    _ptag = fhGetTag(iptr)
    if _ptag ~= 'INDI' then
      error(('_STD_MAT.matRLT: INDI ptr unresolved: %s TAG: %s'):format(tostring(iptr) or '*null', _ptag or '?'))
    end
  elseif _ptyp == 'number' then
    local xptr = fhNewItemPtr()
    xptr:MoveToRecordById('INDI', iptr)
    iptr = xptr:Clone()
  else
    -- tostring() keeps format() safe when iptr is not a string or number
    error(('_STD_MAT.matRLT: INDI ptr unresolved: %s TYPE: %s'):format(tostring(iptr), _ptyp or '?'))
  end

  if not rptr then
    rptr = _YROOT
  else
    _ptyp  = type(rptr)
    if _ptyp  == 'userdata' then
      _ptag = fhGetTag(rptr)
      if _ptag ~= 'INDI' then
        -- report the root pointer and its tag, which is what was just checked
        error(('_STD_MAT.matRLT: INDI _ROOT ptr unresolved: %s TAG: %s'):format(tostring(rptr), _ptag or '?'))
      end
    elseif _ptyp == 'number' then
      local xptr = fhNewItemPtr()
      xptr:MoveToRecordById('INDI', rptr)
      rptr = xptr:Clone()
    else
      -- report the root pointer under this function's own name
      error(('_STD_MAT.matRLT: INDI _ROOT ptr unresolved: %s TYPE: %s'):format(tostring(rptr), _ptyp or '?'))
    end
  end

  local isrlt = ' '
  -- software relationship
  -- family historian relationship
  local fhrlt = (fhCallBuiltInFunction('Relationship', rptr, iptr, 'TEXT', ix)) or ''
  if fhrlt > '' then isrlt = '*' end

  if not lineage[fhrlt] and fhrlt > '' then
    -- fhrlt = family historian relationship fhCallBuiltInFunction('Relationship', rptr, iptr, 'TEXT', ix)
    -- swrlt = software relationship

    local swrlt = fhrlt
    swrlt = swrlt:gsub('once', '1 times'):gsub('twice', '2 times'):gsub('%-removed', ' removed')
    -- align leading numerics and (xd) padded to (0dx)
    swrlt = swrlt:gsub('^(%d[^%d])(.*)', ' %1%2')
    swrlt = swrlt:gsub('%(x(%d)%)', '(0%1x)'):gsub('%(x(%d+)%)','(%1x)')
    swrlt = swrlt:gsub('great%-', 'grand'):gsub('half', '½')

    -- pre = pre replacement area
    -- rpl = replacement area
    -- pst = post replacement area
    local pre
    local rpl
    local pst

    pre, rpl, pst = string.match(swrlt, '(.*[^%d])(%d+ times)(.*)')
    if rpl then swrlt = ('%s(%02ix)%s'):format(pre, tonumber(string.match(rpl, '(%d+)')), pst) end
    lineage[fhrlt] = swrlt:gsub('(.*)great (%(%d%dx%)) (.*)', '%1%2 great %3'):gsub('great ', 'great-')
  end

  _cs =
  {
    RLT = (lineage[fhrlt] or ''),
    isrlt = isrlt,
  }

  if not __rtv then
    cw.rlt = math.max(cw.rlt, UTF8len(_cs.RLT))
    for ged, add in pairs(_cs) do
      GEDRCD[#GEDRCD][ged] = add
    end
  end
  return _cs
end  -- fn matRLT

Re: pattern matching again

Posted: 18 Sep 2022 20:14
by Ron Melby
am I understanding this right?
lineage[fhrlt] = swrlt:gsub( "(.*)great (%(%d%dx%)) (.*)", "%1%2 great %3" ):gsub( "great ", "great-" )
at this point, there is nothing ENDING in great. I either have something like:
husband's great (09x) grandaunt
great (11x) grandaunt
½ great (04x) granduncle

and therefore could I save a gsub by:
lineage[fhrlt] = swrlt:gsub( "(.*)great (%(%d%dx%)) (.*)", "%1%2 great%-%3" )

Re: pattern matching again

Posted: 18 Sep 2022 20:51
by tatewise
Don't forget about husband's great grandaunt

That does not match the (%(%d%dx%)) pattern but still needs great converted to great-
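
A short sketch of the difference, assuming the same patterns as above. The single-pass version here writes a plain hyphen in the replacement string, where only % is special. The variable names are illustrative only.

```lua
local plain = "husband's great grandaunt"

-- Single pass: nothing matches "(NNx)", so the string comes back unchanged.
local one_pass = (plain:gsub('(.*)great (%(%d%dx%)) (.*)', '%1%2 great-%3'))
-- one_pass = "husband's great grandaunt"

-- Two passes: the first gsub is a no-op here, the second still adds the hyphen.
local two_pass = (plain:gsub('(.*)great (%(%d%dx%)) (.*)', '%1%2 great %3'):gsub('great ', 'great-'))
-- two_pass = "husband's great-grandaunt"

-- An entry with a count gives the same result either way.
local counted = 'great (11x) grandaunt'
local counted_one = (counted:gsub('(.*)great (%(%d%dx%)) (.*)', '%1%2 great-%3'))
local counted_two = (counted:gsub('(.*)great (%(%d%dx%)) (.*)', '%1%2 great %3'):gsub('great ', 'great-'))
-- both = "(11x) great-grandaunt"
```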

Re: pattern matching again

Posted: 18 Sep 2022 20:57
by Ron Melby
and that right there is why you make the big bucks. I stand corrected on my correction.