A request for FH to output a full note in CSV files, including newlines.
Currently, if a note contains several paragraphs, only the first paragraph is output, followed by an ellipsis to indicate the missing remainder of the note.
It would be nice if the full note could be output. CSV files can have newlines embedded in fields. The de facto method is to surround such fields with quotation marks, which FH does already. See http://en.wikipedia.org/wiki/Comma-separated_values and http://tools.ietf.org/html/rfc4180.
Example:
"aaa","b CRLF
bb","ccc" CRLF
zzz,yyy,xxx
where aaa starts one record and zzz starts the next, and the second field of the first record (b, newline, bb) contains an embedded newline.
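For anyone who wants to check how a consumer would cope with such a file, here is a minimal sketch using Python's standard csv module. It has nothing to do with FH's own exporter; it only shows that a quoted field with an embedded CRLF survives a write and read round trip. The sample rows mirror the example above.

```python
import csv
import io

# Two records; the second field of the first record contains an embedded CRLF.
rows = [["aaa", "b\r\nbb", "ccc"], ["zzz", "yyy", "xxx"]]

# newline="" stops Python from translating line endings behind the csv module's back.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL, lineterminator="\r\n")
writer.writerows(rows)
csv_text = buffer.getvalue()

# Reading the text back yields two records, not three lines.
parsed = list(csv.reader(io.StringIO(csv_text, newline="")))
```

Only the field that needs it is quoted here; quoting every field, as in the example, is equally valid.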
As an additional, very minor request, could semicolons be added as a delimiter option alongside commas and tabs?
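As a rough illustration of what a semicolon option involves, this Python sketch writes the same kind of data with a selectable delimiter. The helper name write_rows is made up for the example and is not an FH function; the point is that the quoting rule stays the same and only the separator changes.

```python
import csv
import io

def write_rows(rows, delimiter=","):
    """Return rows as CSV text using the given single-character delimiter."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\r\n")
    writer.writerows(rows)
    return buffer.getvalue()

# A field that contains the delimiter itself gets quoted automatically.
semicolon_text = write_rows([["a;b", "c"]], delimiter=";")
tab_text = write_rows([["a", "b"]], delimiter="\t")
```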
http://www.fhug.org.uk/wishlist/wldispl ... lwlref=454
ID:4302
* Output of notes in CSV files
- PeterR
- Megastar
- Posts: 1129
- Joined: 10 Jul 2006 16:55
- Family Historian: V7
- Location: Northumberland, UK
Jon,
I presume you've ruled out using the =GetParagraph function because that outputs each paragraph in a separate column? It also has the benefit of excluding 'hidden' text (in double square brackets).
Peter Richmond (researching Richmond, Bulman, Martin, Driscoll, Baxter, Hall, Dales, Tyrer)
No, that's a useless function.
The whole point about notes is to keep free format text. You will never know ahead of time how many paragraphs are in each note. And if I put paragraphs into separate columns, I could end up with tens of extra columns for large notes. No, fixing the CSV export to allow embedded newlines according to the published standards would be a lot more useful.
It's also even more useless because it splits on newlines, not on actual paragraphs. So
'aaa

bbb'
is three paragraphs ('aaa', '' and 'bbb'), and
'ccc
ddd'
is two ('ccc' and 'ddd').
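The difference between the two ways of counting can be sketched in Python. This models the behaviour as described in this post, not FH's actual =GetParagraph implementation, and both function names are invented for the illustration.

```python
import re

def split_on_newlines(note):
    """Treat every newline as a paragraph break (the behaviour complained about)."""
    return note.split("\n")

def split_on_blank_lines(note):
    """Treat only blank lines as paragraph breaks; single newlines stay inside a paragraph."""
    return [part for part in re.split(r"\n\s*\n", note) if part.strip()]

by_newline = split_on_newlines("aaa\n\nbbb")     # includes the empty "paragraph"
by_blank = split_on_blank_lines("aaa\n\nbbb")    # two real paragraphs
```

With the second approach, 'ccc' followed directly by 'ddd' on the next line stays a single paragraph.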
I've written this message using paragraphs with no blank lines between them to show that it's harder to read/scan. Blank lines between paragraphs make them more readable and feel more natural.
The hidden text aspect is one I'd not thought of. I suspect most CSV output is for further data analysis in spreadsheets and databases rather than for sharing with other researchers (where GEDCOM is more likely to be used). However, it is an issue that does have a rightful place in this discussion, since any export of data from a genealogy package could go anywhere, and privacy is always something to bear in mind with exported data.
- ColeValleyGirl
- Megastar
- Posts: 4853
- Joined: 28 Dec 2005 22:02
- Family Historian: V7
- Location: Cirencester, Gloucestershire
I wouldn't dismiss it as useless, Jon.
Although Notes are useful for holding free format text, they can alternatively be used to present formatted information with a particular paragraph structure, in which case GetParagraph is very useful for analysing the contents. (I've seen this recently in a GEDCOM file exported from another genealogy program.)
I do, however, agree that outputting the full note ought to be the default behaviour, with an option to omit elements marked as private.
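A minimal sketch of what such an option might look like, in Python. It assumes hidden text is whatever sits between double square brackets, as mentioned earlier in the thread; the function names and the include_hidden flag are hypothetical and not part of FH.

```python
import re

# Assumption: hidden text is anything between [[ and ]], possibly spanning lines.
HIDDEN_TEXT = re.compile(r"\[\[.*?\]\]", flags=re.DOTALL)

def strip_hidden(note):
    """Remove every [[...]] span from the note text."""
    return HIDDEN_TEXT.sub("", note)

def export_note(note, include_hidden=False):
    """Return the full note for export, omitting hidden text unless asked to keep it."""
    return note if include_hidden else strip_hidden(note)

sample = "Born in Leeds.\n\n[[Adopted.]]\n\nMoved in 1901."
public_text = export_note(sample)
private_text = export_note(sample, include_hidden=True)
```

The full note, paragraphs and all, is kept either way; only the bracketed spans are dropped by default.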
Helen Wright
ColeValleyGirl's family history