No searchable text in PDF reports
Posted: 30 Dec 2020 16:23
by PeterR
I've noticed a problem with all the Report types I've used so far, and also a Book:
when saved as PDF, every page consists of a single graphic image, with no searchable text. This explains two problems mentioned elsewhere: much larger file sizes, and much longer times. Printing to a PDF printer is no better. Searchable PDF files (as in earlier FH versions) would be much more useful.
By contrast all the Diagrams I've so far saved to PDF still have searchable text (as do Query Results) which can be a useful feature.
Now reported to CP: Ticket #422085.
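For anyone wanting to confirm the symptom on their own files, here is a rough Python sketch. It is not anything from Calico Pie or NovaPDF. It assumes page content sits in plain or Flate-compressed streams, and it looks for text-showing operators (Tj/TJ inside BT ... ET) while noting any image XObjects. The name classify_pdf is made up for illustration, and the check is only a heuristic: PDFs using other filters or object streams may come back as "unknown".

```python
import re
import zlib

# One indirect object: everything between "obj" and "endobj".
OBJECT_RE = re.compile(rb"\bobj\b(.*?)\bendobj\b", re.DOTALL)
# Split an object body into its dictionary part and its raw stream data.
STREAM_RE = re.compile(rb"^(.*?)stream\r?\n(.*)endstream", re.DOTALL)
# A stream whose dictionary declares an image XObject (pixel data).
IMAGE_DICT_RE = re.compile(rb"/Subtype\s*/Image\b")
# A string operand followed by Tj or TJ inside a BT ... ET text object.
TEXT_RE = re.compile(rb"\bBT\b.*?[)\]>]\s*T[jJ]\b.*?\bET\b", re.DOTALL)


def classify_pdf(data: bytes) -> str:
    """Heuristically return "text", "image-only" or "unknown" for raw PDF bytes."""
    has_text = False
    has_image = False
    for obj in OBJECT_RE.finditer(data):
        stream = STREAM_RE.match(obj.group(1))
        if stream is None:
            continue  # object has no stream (e.g. catalog or page dictionary)
        header, raw = stream.groups()
        if IMAGE_DICT_RE.search(header):
            has_image = True
            continue  # pixel data: do not scan it for text operators
        try:
            # decompressobj tolerates the end-of-line bytes left after the data
            content = zlib.decompressobj().decompress(raw)
        except zlib.error:
            content = raw  # assume the stream was stored uncompressed
        if TEXT_RE.search(content):
            has_text = True
    if has_text:
        return "text"
    return "image-only" if has_image else "unknown"
```

Calling classify_pdf(open("report.pdf", "rb").read()) on a saved report (the filename is just an example) would be expected to give "image-only" when every page is a single graphic, and "text" for a searchable file.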
Re: No searchable text in PDF reports
Posted: 30 Dec 2020 16:38
by tatewise
I don't recall that anyone has spotted that significant change, which would be a backward step IMO.
So not only do PDFs take much longer to produce and create much larger files, but they are also not searchable ~ assuming you can create them at all, and I cannot.
Re: No searchable text in PDF reports
Posted: 30 Dec 2020 16:45
by Mark1834
Well done Peter, that explains what we have seen, and why text quality is so downgraded when downscaling images.
I'm not sure how to interpret that. I can't believe it was a deliberate move to take out that functionality (it means we can't copy text either), but neither can I credit that an established software house would make such a boo boo by accident...
Re: No searchable text in PDF reports
Posted: 30 Dec 2020 16:58
by AdrianBruce
Definite thanks from me as well, Peter - even though I'm a Middle Ages v6 user!
I have absolutely no experience of direct use of .PDF creation routines so I have no idea whether that's a simple slip up or what...
Re: No searchable text in PDF reports
Posted: 30 Dec 2020 17:02
by LornaCraig
I seem to recall that during beta testing someone picked up on the non-searchable text, and it was logged as a suggestion. I don't recall any indication of whether it was deliberate or not.
Re: No searchable text in PDF reports
Posted: 30 Dec 2020 17:03
by paultt
Ahhh! That explains why I cannot get any of the hyperlinks carefully constructed within the new editor to work from a generated pdf file. Never dreamt that the whole page was made into one image by the fh&nova7 combination!
Well spotted. I think I may have to consider going back to v6!
Re: No searchable text in PDF reports
Posted: 30 Dec 2020 17:19
by Mark1834
I realise we are trying to second guess Calico Pie, but I've tried printing an existing 50 page Word document containing mixed text and graphics using the FH PDF printer. It's a larger file than Microsoft Print to PDF produces (10.7 vs 6.5 MB), but the text is still fully searchable. We know the FH reports are still text when displayed, as they output properly to RTF, so where is the conversion occurring...?
Re: No searchable text in PDF reports
Posted: 30 Dec 2020 17:44
by tatewise
The conversion occurs when FH formats the Report, before sending it as one image per page to the Family Historian Nova PDF.
Re: No searchable text in PDF reports
Posted: 30 Dec 2020 18:54
by dewilkinson
To get a searchable PDF that is also a sensible size, Save As RTF, load it into Word and Save As a PDF. I shall be using this route for the foreseeable future (tier 4 willing!)
Re: No searchable text in PDF reports
Posted: 30 Dec 2020 20:46
by tatewise
But as has been said before, that route does not work well if media images are required in the Report.
It does seem inconceivable that Calico Pie have deliberately gone down this path.
One neat trick with actual text in the PDF file is that a URL becomes an active hyperlink, and that feature is now missing.
Re: No searchable text in PDF reports
Posted: 30 Dec 2020 20:57
by dewilkinson
Indeed, but workarounds are never perfect; lesser of two evils? The only other option I can see is manipulating images in Word before generating the PDF, but for a large report that is going to be tedious if not impossible. I'm with you, I can't believe Calico Pie has done this deliberately. It has crossed my mind to look at other packages, but I shan't do anything immediately as I thought FH6 was the best available, but ...
Re: No searchable text in PDF reports
Posted: 30 Dec 2020 21:09
by davepacey
Until the reports are fixed, I have no intention of getting rid of V6.27. The reporting is far superior; it's just the hassle of swapping between the two, but the plugin route works just fine for me.
Re: No searchable text in PDF reports
Posted: 01 Jan 2021 00:39
by Andy63
Personally, I think they rushed it out for Christmas and the PDF part of the software was left unfinished. It's impossible that they didn't know about it.
Re: No searchable text in PDF reports
Posted: 01 Jan 2021 10:09
by Mark1834
It's a difficult balance for software houses. Do they release a new version with known issues or wait until it is perfect but earn nothing from it? There is no right answer. A month ago, RootsMagic announced a public beta of version 8 by year-end, but it has not appeared. If Calico Pie were working to a firm year-end deadline, they had to make a judgement call on whether to leave this unfixed or not.
We have no way of knowing how many users this major flaw impacts - is it just a small vocal minority, or a significant number? The risk for CP is that magazine reviewers and bloggers pick it up, and reputations are lost much more quickly than they are gained.
Re: No searchable text in PDF reports
Posted: 01 Jan 2021 11:01
by Mark1834
Mark1834 wrote: ↑01 Jan 2021 10:09
A month ago, RootsMagic announced a public beta of version 8 by year-end, but it has not appeared.
Correction - it wasn't there yesterday afternoon (UK time), but is now available - just in time planning!
Re: No searchable text in PDF reports
Posted: 01 Jan 2021 11:19
by ColeValleyGirl
One software product I use entered beta testing for a new version on 20 November.
2017.
It has just updated its projected release date to 2021. I can only assume they're aiming for zero bugs, which is laudable but very annoying...
I think I prefer CP's approach -- PDF production isn't core functionality, after all, and there are workarounds (albeit inconvenient) for those who want to produce PDFs.
Re: No searchable text in PDF reports
Posted: 01 Jan 2021 12:02
by PeterR
PDF File Support has been part of FH functionality since at least version 5.0, almost 10 years ago, as the following quote from the Help topic shows (unchanged between FH5 and FH7):
Saving items in PDF file format is very convenient for a number of purposes - here are some:
It can be a useful way of sending charts, reports and books to other people - for example, as email attachments.
Good luck trying to send the bloated PDFs from FH7 by email!
Why have CP broken some functionality that already worked perfectly?
Re: No searchable text in PDF reports
Posted: 01 Jan 2021 17:05
by AdrianBruce
PeterR wrote: ↑01 Jan 2021 12:02
... Why have CP broken some functionality that already worked perfectly?
Why does any software company deliberately break functionality that already worked perfectly?
They don't. Seriously, having worked for a software organisation, they don't. Not unless there are other issues elsewhere and it's a least worst option.
For most of us, there is a decent temporary work-round - saving the report to .RTF and PDFing it from our favourite word-processor. I wouldn't be surprised if that disrupts image positioning in some reports, that's always the first thing to go wonky in my experience, so it's not necessarily 100% effective.
I have no idea how to set Nova PDF (assuming that's what's involved) to produce image-only PDFs - apparently it's as simple as a "Print as Image" checkbox in a dialog box in the full-fat Adobe Acrobat - so whether it's possible to accidentally configure an internal, unchangeable default for FH's routine, I've no idea. But that may be what's happened here. Just a complete guess from me.
Re: No searchable text in PDF reports
Posted: 01 Jan 2021 17:14
by RS3100
I think that the formatting as an image is happening within FH, before the data is passed to the Nova PDF driver, since I've found that reports are printed as images in other PDF drivers that are installed on my PC, such as the CutePDF, Foxit and Quicken drivers.
I normally print to CutePDF and nothing has changed with that driver since I "upgraded" to FH7. It printed text searchable reports that I sent to it from FH6.2.
Re: No searchable text in PDF reports
Posted: 01 Jan 2021 17:24
by PeterR
I agree with RS3100. I have also tried a variety of PDF printers. The problem is not within NovaPDF. CP have accidentally or otherwise changed the way reports and books are output from FH for printing or saving as PDF. And I've tried the RTF route, but the printed layout is significantly inferior.
Re: No searchable text in PDF reports
Posted: 01 Jan 2021 17:34
by AdrianBruce
RS3100 wrote: ↑01 Jan 2021 17:14
I think that the formatting as an image is happening within FH, before the data is passed to the Nova PDF driver, ...
Hmm. It does sound like there's either a common API to all those PDF drivers, one that we can't access, that has accidentally been set up to imagize PDFs, or FH is always calling an imagize routine, when there was meant to be a dialog box somewhere in FH, Printer Set-up or whatever, that gave the option "Print As Image". Weird. Hopefully it is just a plumbing issue along those lines.
Re: No searchable text in PDF reports
Posted: 01 Jan 2021 17:36
by AdrianBruce
PeterR wrote: ↑01 Jan 2021 17:24
... I've tried the RTF route, but the printed layout is significantly inferior.
Darn - I was afraid of that...
Re: No searchable text in PDF reports
Posted: 01 Jan 2021 17:56
by paultt
For me, the RTF file route and then to PDF does not work. As soon as we go to .rtf, it will be assumed that the output will be on paper, and I cannot get a link to click on paper.
I tested with a small report in which I had entered website hyperlinks in the notes. If I save it as an HTML page, the links work OK.
If I redirect the report to the standard Micro$oft PDF printer that comes with Windows, I get a blank PDF file, which is useless.
If you DO manage to get an RTF file to load into, say, Word or WordPad, it will mean searching for ALL the text that was a hyperlink, setting each piece of wording to a hyperlink within Word, and then sending it to a PDF generator, like the Micro$oft PDF printer!
There are so many things in v7 for which I have had to change my workflow, and they nearly all involve extra mouse clicks or tabs, or manual interventions, that FH is not an easy and friendly genealogy package anymore. Try copying some text say from a website or decent pdf, or even your email signature which has a link to your website, into one of the 'new' note fields using the rich text formatting. I have found that the paste actually messes up the formatting of the original, which is what I was hoping to keep. Lots of hidden gotchas, not just the pdf fiasco.
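The re-linking chore described above could at least be given a checklist. Here is a minimal Python sketch, assuming the report text has been copied out of the RTF as plain text. The name find_urls and the pattern are illustrative only, and unusual links will be missed.

```python
import re

# URL-like text: starts with http://, https:// or www. and runs to the
# next whitespace, quote, bracket or closing parenthesis.
URL_RE = re.compile(r"(?:https?://|www\.)[^\s<>\"')]+", re.IGNORECASE)


def find_urls(text: str) -> list:
    """Return URL-like strings in order of appearance, minus trailing punctuation."""
    return [m.group(0).rstrip(".,;:!?") for m in URL_RE.finditer(text)]
```

Each string returned is a candidate to turn back into a live hyperlink in the word processor before generating the PDF.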
Re: No searchable text in PDF reports
Posted: 01 Jan 2021 19:36
by fhtess65
I was able to make the export to .rtf and then publish to .pdf work; however, I had to use Papyrus Author. I tried using both Word and Atlantis, but neither could read the table from FH7. I then tried LibreOffice, with the same result as Word and Atlantis. See attached...
Very interesting that only a very specialized piece of software like Papyrus Author (free version) was able to open and render the .rtf properly. Not many people would even have it. I will try it in Scrivener as well and see what happens.
Re: No searchable text in PDF reports
Posted: 01 Jan 2021 20:42
by AdrianBruce
Hmmm. Thanks Teresa - looks like my idea of save as .RTF doesn't work with a lot of stuff, then.
I'm beginning to wonder if this transformation into an image is deliberate after all - if MS Word can't cope with some of the .RTF formatting - well, maybe NovaPDF has issues with whatever its text-based input format is as well? Hence rendering it as an image?
I have no idea about versions of RTF - whether NovaPDF and MS Word are behind in their versions of RTF (or whatever NovaPDF uses as input) - I guess that if Papyrus Author can cope, that may be so. Maybe CP can't find the correct RTF coding to render what they need in NovaPDF, so have rendered it as an image? More and more, it looks not nice to me ... Or maybe I'm thinking too much...
