Hi,
I have a large number of lines like:
line-1
line-2
line-3
line-4
line-5
line-6
...
...and I want to convert every 3 successive lines into the 3 columns of a table row:
TABLE:
        Col-1    Col-2    Col-3
Row 1   line-1   line-2   line-3
Row 2   line-4   line-5   line-6
...
There are too many lines, so doing it manually would be too slow!
Thanks in advance,
Last edited by makh (2021-12-01 20:19:12)
OS: Arch &/ Debian
System: LENOVO ThinkPad E14
Desktop: Xfce
Offline
Do you have some formatting or is that pure text? For pure text I'd use the shell or Python to create a csv or tsv table (works best if you have no tabs in your text). If your data has tabs, choose a different delimiter like semicolon ;, pipe |, number sign #, tilde ~, ...
paste -s -d"\t\t\n" plaintext.file > output.tsv
Then open it in Calc and copy to Writer.
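As a sketch of the delimiter choice described above: check whether the text contains tabs, then pick the paste delimiter list accordingly. The sample file, its contents and the pipe fallback are assumptions made for illustration only.

```shell
# Made-up sample data standing in for plaintext.file.
workdir=$(mktemp -d)
printf 'one\ntwo\nthree\nfour\nfive\nsix\n' > "$workdir/plaintext.file"

# A literal tab character to search for.
tab=$(printf '\t')

# Use tabs as separators unless the text already contains tabs;
# in that case fall back to the pipe character.
if grep -q "$tab" "$workdir/plaintext.file"; then
    delims='||\n'
else
    delims='\t\t\n'
fi

# paste itself expands \t and \n in the delimiter list.
paste -s -d "$delims" "$workdir/plaintext.file" > "$workdir/output.tsv"
cat "$workdir/output.tsv"
```

With the pipe fallback the result is no longer a real tsv file, so the separator has to be selected by hand in Calc's import dialog.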
Some options to do it purely in Calc or Writer: https://ask.libreoffice.org/t/select-ev … iter/60337
Last edited by progandy (2021-11-13 21:12:23)
| alias CUTF='LANG=en_XX.UTF-8@POSIX ' | alias ENGLISH='LANG=C.UTF-8 ' |
Offline
Hi
@progandy
The file is plain text only. I added a semicolon to the end of each line. I tried the paste command as follows, but when I open the result in Calc it doesn't construct 3 columns; it only makes 2 columns.
sed '/^[[:space:]]*$/d' file1.txt | paste -d ";" - - - >file2.csv
Thank you
Offline
awk '!(NR%3){print p " " q " " $0} {p=q; q=$0}' file1.txt
Offline
Hi
Actually my lines are sentences that already contain spaces, so a space as separator will not help:
The Solar System is the gravitationally bound system.
The Sun and the objects that orbit it, either directly or indirectly.
The Earth is round shaped.
The Moon Circles the Earth.
The Solar Year is 365 days.
...
...
<more lines>
Sorry, I misspoke by calling them lines at the start.
That's why I tried adding a semicolon at the end of each sentence, to make Calc split them into 3 columns, but it didn't help.
Thank you
Offline
Like this? paste takes one line from the file, adds the current separator from the given list, takes the next line, adds the next separator and so on.
paste -s -d ";;\n" in.txt >out.csv
Last edited by progandy (2021-11-18 05:03:40)
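To make the cycling concrete, here is a small sketch with made-up input of five lines (deliberately not a multiple of three), showing which separator lands where:

```shell
# Five made-up input lines; paste -s joins them using ';', ';', '\n' in turn.
joined=$(printf 'a\nb\nc\nd\ne\n' | paste -s -d ';;\n' -)
printf '%s\n' "$joined"
# a;b;c
# d;e
```

A leftover one or two lines at the end still show up here, as a shorter last row.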
Offline
Are these plain text files, or are they a rich format like .docx? If they are plain text there are many ways to do it. One is to load the file as a list of rows in Python, then do a simple for loop to transform it. Another one is to do a regex substitution like "s/([^\n]+)\n([^\n]+)\n([^\n]+)\n/\1\t\2\t\3\n/g" over the whole file (for example with GNU sed -E -z, since sed otherwise sees only one line at a time). The awk method above is a third one.
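For the sed route, a line-by-line sketch may be easier than a multi-line regex: N appends the next input line to the current one, so two Ns collect three lines, and the embedded newlines can then be swapped for tabs. This is a different technique from the substitution quoted above, not a drop-in for it, and the function name join3_sed is made up for this example.

```shell
# join3_sed: hypothetical helper; reads lines on stdin and writes
# tab-separated rows of three to stdout.
join3_sed() {
    tab=$(printf '\t')
    # N;N pulls two more lines into the pattern space; the s command
    # then replaces the two embedded newlines with tabs.
    sed "N;N;s/\n/${tab}/g"
}

# Made-up sample input with six lines.
printf 'a\nb\nc\nd\ne\nf\n' | join3_sed
```

If the line count is not a multiple of three, what happens to the leftover lines depends on the sed implementation, so check the end of the output.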
If they are rich text, it's harder. You can try to play with LO Writer's own search & replace with regex enabled. You could also try converting them to something easier like HTML with pandoc, transform there, then copy back into the document.
Offline
You can just replace the blank with whatever you want…
awk '!(NR%3){print p "|" q "|" $0} {p=q; q=$0}' file1.txt
awk '!(NR%3){print p "\t" q "\t" $0} {p=q; q=$0}' file1.txt
awk '!(NR%3){print p "?" q "?" $0} {p=q; q=$0}' file1.txt
NB: it won't deal with a tail (i.e. if the number of lines is not a multiple of three, the last one or two lines are not even printed!)
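A possible variant that also prints the tail, as a sketch: it builds each row incrementally and flushes whatever is left in an END block. The function name join3_keep_tail and the awk variable sep are made up for this example.

```shell
# join3_keep_tail: hypothetical helper.
# $1 is the separator to put between the columns.
join3_keep_tail() {
    awk -v sep="$1" '
        # The first line of a group starts a new row, the others are appended.
        { row = (NR % 3 == 1) ? $0 : row sep $0 }
        # Every third line completes the row.
        NR % 3 == 0 { print row }
        # Leftover one or two lines: print the incomplete row as well.
        END { if (NR % 3 != 0) print row }
    '
}

# Made-up sample input with five lines.
printf 'a\nb\nc\nd\ne\n' | join3_keep_tail ';'
# a;b;c
# d;e
```

Whether a short last row is acceptable depends on what the spreadsheet import should look like; padding it with empty fields would be another option.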
Offline
Hi
@progandy
Your paste method works fine for English text, all the way to the columns in Calc.
But my data has RTL lines in Urdu and Arabic mixed with English, and there the command doesn't work properly.
With the English file it successfully merges the lines into one line, but with the multi-language file it does not create one line from 3 lines.
Thank you
Offline
Hi
To the question about plain text versus a rich format: the file is plain text only! But I don't have much command of awk or Python.
Thank you
Offline
Hi
With these awk variants using a different separator, Calc still makes only two columns and does not convert correctly. I think every three lines need to end up on one line for this method to work.
Thank you
Offline
Can you please post a sample of the text you're actually dealing with?
awk '!(NR%3){print p ";" q ";" $0} {p=q; q=$0}' will combine three lines into one and separate them w/ semicolons.
The common format is https://en.wikipedia.org/wiki/Comma-separated_values, which will mandate some adjustments to the fields (notably quoting fields that contain commas), and its field separator is a comma, not a semicolon.
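A sketch of what such adjustments could look like, assuming the usual CSV convention: wrap every field in double quotes, double any quote inside a field, and separate the fields with commas. The function name join3_csv is made up for this example, and like the one-liner above it drops an incomplete last group.

```shell
# join3_csv: hypothetical helper; reads lines on stdin and writes
# comma-separated, fully quoted rows of three to stdout.
join3_csv() {
    awk '
        {
            field = $0
            gsub(/"/, "\"\"", field)      # double embedded quotes
            field = "\"" field "\""       # wrap the field in quotes
            row = (NR % 3 == 1) ? field : row "," field
        }
        NR % 3 == 0 { print row }         # three fields make one row
    '
}

# Made-up sample input containing a comma and embedded quotes.
printf 'a, b\nsay "hi" now\nc\n' | join3_csv
# "a, b","say ""hi"" now","c"
```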
Also,
"I added semi colon to end of each line."
might be a problem, because the result will look like either
line1;line2;line3;
or
line1;;line2;;line3;
and I guess the trailing semicolon would be an issue. You could sed that away.
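One way to do that, as a sketch: strip a trailing semicolon from every input line before joining, so that the separators come only from the join step. The sample input is made up.

```shell
# Made-up input where every line already ends in a semicolon.
# sed drops the trailing semicolon (and any whitespace after it),
# then paste joins every three lines with a single semicolon.
cleaned=$(printf 'line1;\nline2;\nline3;\n' |
    sed 's/;[[:space:]]*$//' |
    paste -d ';' - - -)
printf '%s\n' "$cleaned"
# line1;line2;line3
```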
Last edited by seth (2021-11-19 21:44:25)
Offline
Hello
I tested your awk command on:
https://file.re/2021/11/28/beta1/
It doesn't convert 3 lines into one!
Thank you
Offline
Because the file uses CRLF line endings (https://en.wikipedia.org/wiki/Newline), so every line still carries a carriage return at its end. Strip those first:
awk '{ sub("\r$", ""); print }' beta1.txt > beta2.txt
awk '!(NR%3){print p ";" q ";" $0} {p=q; q=$0}' beta2.txt
Offline
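Both steps can also be combined in one pass, as a sketch: strip a trailing carriage return from each line, then join as before. The sample input with CRLF line endings is made up.

```shell
# Made-up CRLF input; every line ends in \r\n.
# The first awk rule removes the carriage return, the rest is the
# same three-line join as above.
merged=$(printf 'a\r\nb\r\nc\r\n' |
    awk '{ sub(/\r$/, "") } !(NR%3){print p ";" q ";" $0} {p=q; q=$0}')
printf '%s\n' "$merged"
# a;b;c
```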
Hello!
1. This works OK. Thanks a lot!
2. When Calc imports the file, it lays it out LTR. Can the import be changed to RTL right here in the awk command? Then I wouldn't have to make the change in the Calc file; with very large data LibreOffice even hangs, while text manipulation has no issues!
Regards,
Offline
Do you want to re-order the columns or change the direction of the text?
Is it currently wrong, or does Calc ignore the RTL condition? (I'm asking because I'm not sure inverting the glyphs, i.e. "rev", is going to produce the correct results.)
(idk. whether libreoffice is bidi-capable but would have assumed so)
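If re-ordering the columns is what is wanted, a sketch of that would be to print each group of three in reverse order. Whether this gives the desired layout in Calc is an assumption; it does not change the direction of the text itself. The sample input is made up.

```shell
# Made-up input; each output row lists its three lines last-to-first.
reordered=$(printf 'a\nb\nc\nd\ne\nf\n' |
    awk '!(NR%3){print $0 ";" q ";" p} {p=q; q=$0}')
printf '%s\n' "$reordered"
# c;b;a
# f;e;d
```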
Offline
Hello
I think it should work if I can find out how to switch the page direction from left-to-right to right-to-left in Calc after importing the text file.
The data is very large; moving the columns would work, but it would take time.
Thank you
Edit 1: Found the RTL option in the sheet's drop-down menu; in Writer the same setting is on the page!
Last edited by makh (2021-12-01 20:24:38)
Offline