Mysterious doubling of file size

Posted: 30 Jun 2016 22:12
by quarlton
I wonder if anyone has come across this before:-

I have been given a GEDCOM file created in FH v5.
It is 20 MB.

Opened in a text editor, it reports:
Lines: 726,153
Length: 20,314,388

I then open it in FH 6.2.2,
simply click on Save, and then exit.

Windows Explorer (Win 10) now reports it as 40 MB!

Re-opening it in the text editor, the values have increased:

Lines: 733,993 (up 7,840)
Length: 20,720,077 (up 405,689)
This is accounted for by a list of all places that now appears at the end of the file.
The size of this list matches the extra lines and length, and is obviously nothing like the extra 20 MB that Windows thinks is there.

The only major difference I can spot is that the original was encoded as ANSI and the new one as Unicode.
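
For anyone wanting to confirm the encoding without relying on the text editor, here is a minimal sketch that looks at the first few bytes of the file for a byte-order mark. The helper name sniff_bom is made up for illustration, and it is only an assumption that the saved file starts with a byte-order mark; a file without one is reported as unknown rather than guessed at.

```python
import codecs


def sniff_bom(first_bytes: bytes) -> str:
    """Guess a text encoding from a leading byte-order mark, if present.

    Hypothetical helper for illustration. It only recognises the UTF-8 and
    UTF-16 marks and does not try to tell UTF-32 apart from UTF-16.
    """
    if first_bytes.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if first_bytes.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if first_bytes.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    # No mark found: could be ANSI, or UTF-8 saved without a mark.
    return "no BOM"


# Typical use (the path is a placeholder):
# with open("family.ged", "rb") as f:
#     print(sniff_bom(f.read(4)))
```

Reading only the first four bytes keeps this quick even on a 20 MB file.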


[As an afterthought, I uploaded to my site using FTP, and that also reports it as 40MB]
Puzzled

Dave Simpson

Re: Mysterious doubling of file size

Posted: 01 Jul 2016 00:16
by BillH
Dave,

I believe this is due to FH 6 using UTF-16 as its default file format. UTF-16 stores each character of ordinary text in two bytes, where ANSI uses one, so the file is about twice as large on disk. You can override this to UTF-8 using Tools > Preferences > File Load/Save and clicking on Save in UTF-8 file format.

Bill
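
To illustrate the doubling, here is a rough sketch that encodes the same text three ways and compares the byte counts. The sample GEDCOM lines and the function name encoded_sizes are made up for illustration, cp1252 is used as a stand-in for "ANSI" on a Western European Windows setup, and the estimate at the end assumes the text editor's Length figure counts characters rather than bytes.

```python
def encoded_sizes(text: str) -> dict:
    """Return the number of bytes the text occupies in each encoding."""
    return {
        "cp1252": len(text.encode("cp1252")),   # "ANSI" stand-in, 1 byte per character
        "utf-8": len(text.encode("utf-8")),     # 1 byte per ASCII character
        "utf-16": len(text.encode("utf-16")),   # 2 bytes per character plus a 2-byte mark
    }


# Made-up sample lines in GEDCOM style, plain ASCII only.
sample = "0 @I1@ INDI\n1 NAME John /Smith/\n"
sizes = encoded_sizes(sample)
print(sizes)

# Rough estimate for the file in this thread: 20,720,077 characters at 2 bytes each.
estimate_bytes = 20_720_077 * 2
estimate_mib = estimate_bytes / 2**20
print(f"{estimate_bytes:,} bytes, about {estimate_mib:.1f} MiB")
```

Under those assumptions the estimate comes out close to the 40 MB that Windows Explorer and the FTP client report, while the same text in UTF-8 would stay near the original 20 MB as long as it is mostly plain ASCII.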

Re: Mysterious doubling of file size

Posted: 01 Jul 2016 07:54
by quarlton
Hi Bill

Many thanks for that, it solved the issue.
Because my own files are relatively small I have never had reason to notice.
It was because the new file was so much larger that it jumped out at me.

Many thanks