
PDF Text Highlighting

Posted: 26 Aug 2015 14:53
by GeoffWalter
I have downloaded several newspaper extracts from findmypast in PDF format. I have tried to highlight relevant text using Adobe Reader XI without success for any of those pdf files. However, I have had success highlighting text using Adobe Reader XI in PDF files of the London Gazette downloaded from the National Archives. I confess I don't understand why one works and not the other.

Can anyone recommend another PDF viewer that might enable me to highlight text in the findmypast downloads?

Thank you.

Geoff

Re: PDF Text Highlighting

Posted: 26 Aug 2015 15:27
by DavidNewton
I think it depends on how the image was scanned. There is a brief explanation here:

https://itaccessibility.illinois.edu/pd ... textvimage

I have done the cursor test on London Gazette images and the ones I have contain text, whereas the cursor test on PDFs from FindMyPast indicates images only.

David

Re: PDF Text Highlighting

Posted: 26 Aug 2015 15:32
by tatewise
The complication is that, just like MS Word docs, PDF docs can be any combination of text and images.

If it is primarily text, then text highlighting works OK.

If it is just an image of a page, then text highlighting will not work.

See http://computers.tutsplus.com/tutorials ... -cms-20406 for a full explanation.

So you need an OCR utility, or a PDF tool with OCR built in like Adobe Acrobat.

One option is the IrfanView utility, which has an Options > Start OCR plugin.
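
For anyone who wants to check a batch of downloads without opening each one, the text-versus-image distinction above can be roughly tested with a short script. This is only a sketch under stated assumptions, not a definitive test: it assumes that a PDF with selectable text declares at least one font resource (a `/Font` entry), and that any compressed parts of the file use Flate compression. The function names are made up for illustration, and encrypted or unusually encoded files could fool it.

```python
import re
import zlib


def pdf_stream_payloads(data: bytes):
    """Yield the raw file bytes, then every stream that inflates as Flate data."""
    yield data
    for match in re.finditer(rb"stream\r?\n(.*?)endstream", data, re.DOTALL):
        try:
            # decompressobj tolerates the trailing newline before "endstream"
            yield zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # not Flate data (for example a JPEG scan), so skip it


def looks_image_only(data: bytes) -> bool:
    """Heuristic (assumption): no /Font entry anywhere means no text layer."""
    font_ref = re.compile(rb"/Font\b")
    return not any(font_ref.search(chunk) for chunk in pdf_stream_payloads(data))
```

To try it on a real file, read the file in binary mode and pass the bytes in, for example `looks_image_only(open("extract.pdf", "rb").read())`, where `extract.pdf` is a placeholder name. A result of True suggests the page is just a scan and will need OCR before text highlighting can work.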

Re: PDF Text Highlighting

Posted: 26 Aug 2015 15:33
by brianlummis
There is a video from Adobe on the approach you need with a scanned document that has been converted to PDF. Have a look at
http://tv.adobe.com/watch/acrobat-xi-ti ... -pdf-file/ to see if that will help. I am getting the same problem as you with Foxit Reader and Nitro Reader!

[EDIT by Mike Tate: I think that video only applies to Adobe Acrobat, not the free Adobe Reader.]

[EDIT by Brian Lummis : According to the description it applies to Acrobat XI which was the software mentioned in the original post]

[EDIT by Mike Tate: Yes, the description applies to the Adobe Acrobat XI editor, which must be bought, but the software mentioned was Adobe Reader XI, which is the cut-down free reader.]

Re: PDF Text Highlighting

Posted: 26 Aug 2015 20:16
by jimlad68
I've not tried it, but depending on the software you use (possibly even extract the image), you may be able to put a "translucent" picture on top of a picture to create a highlight effect, or perhaps an underline or a box.

Re: PDF Text Highlighting

Posted: 26 Aug 2015 20:27
by DavidNewton
Try PDF-Xchange Editor free edition.

http://www.tracker-software.com/product ... nge-editor

David

Re: PDF Text Highlighting

Posted: 26 Aug 2015 20:41
by brianlummis
Have just been having another play with Foxit Reader 7 and you can highlight a rectangular area by going to Comment - Drawing - Area Highlight. The standard colour is bright yellow with 100% opacity but this can be altered to your own choice through the properties box. It does also generate a comment box (which depending on your point of view may or may not be a benefit) but this only opens when you double click the area.

Re: PDF Text Highlighting

Posted: 26 Aug 2015 21:35
by tatewise
Having re-read the original posting from Geoff, it has various interpretations.
One is that the text highlighting is to be retained in the PDF document to focus on specific text.
Another is that the text highlighting is required to extract a copy of the text for pasting into say a Text From Source field.

Re: PDF Text Highlighting

Posted: 27 Aug 2015 15:20
by GeoffWalter
Thanks to you all. I've been seeking to highlight text in the original image rather than extract text. From what findmypast have now told me it would seem that the issue relates to their scanning.

Geoff

Re: PDF Text Highlighting

Posted: 06 Sep 2015 20:50
by PeterR
It may be possible to use the OCR facility in PDF-XChange; then the resulting text can be highlighted.