Hi,
I have a file.csv like this:
A;B;C;D;E;F;G;H;I;J;K
1;2;3;4;5;6;7;8;9;10;11
aa;bb;cc;dd;ee;ff;gg;hh;ii;jj;kk
And I want to convert it to:
A;1;aa
B;2;bb
C;3;cc
etc.
Do you have any idea, please?
Thanks
Last edited by SupKurtJ (2021-06-30 22:36:00)
Offline
This does not address the use of Bash, but in my opinion, awk is the better tool for this.
Nothing is too wonderful to be true, if it be consistent with the laws of nature -- Michael Faraday
Sometimes it is the people no one can imagine anything of who do the things no one can imagine. -- Alan Turing
---
How to Ask Questions the Smart Way
Offline
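I found a solution, thanks:
awk -F";" '
# store every field in A[row,col]; n tracks the widest row
{for(i=1;i<=NF;i++){A[NR,i]=$i};if(NF>n){n=NF}}
# print each column as a row, joined with ";"
END{for(i=1;i<=n;i++){
  s=""
  for(j=1;j<=NR;j++){s=(j>1?s";":"")A[j,i]}
  print s}}' file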
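For comparison, here is a minimal Python sketch of the same idea as the awk script: read everything into a row/column grid, find the widest row, then emit each column as a line. The names `transpose_rows` and `transpose_text` are made up for illustration, and padding short rows with an empty string is an assumption about how ragged input should be treated.

```python
def transpose_rows(rows, fill=""):
    """Transpose a list of rows, padding short rows with `fill`."""
    # width plays the role of n in the awk script: the widest row seen
    width = max((len(r) for r in rows), default=0)
    return [[r[i] if i < len(r) else fill for r in rows] for i in range(width)]


def transpose_text(text, sep=";"):
    """Split text into rows on `sep`, transpose, and join back into lines."""
    rows = [line.split(sep) for line in text.splitlines() if line]
    return "\n".join(sep.join(col) for col in transpose_rows(rows))


sample = "A;B;C\n1;2;3\naa;bb;cc"
print(transpose_text(sample))
```

Like the awk version, this keeps the whole file in memory, which is fine for small CSV files.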
Last edited by SupKurtJ (2021-06-30 22:24:45)
Offline
#!/usr/bin/python
import sys
lines = [l.strip().split(';') for l in sys.stdin]
# iterate over columns (the width of the first row), not over rows
for n in range(len(lines[0])):
    print(';'.join(l[n] for l in lines))
Run it as `prog <file.csv`
Offline
rs -c';' -C';' -T <file
https://www.freebsd.org/cgi/man.cgi?query=rs
Last edited by jasonwryan (2021-06-30 23:51:12)
Offline
@jasonwryan, so install BSD first?
Offline
With Jason, I always feel like the guy with a hammer, saw, and screwdriver; he is the guy with the planer, joiner, lathe, radial arm saw, and 15 varieties of router.
Offline
@jasonwryan, so install BSD first?
For a one-liner? Of course!
He is the guy with the planer, joiner, lathe, radial arm saw, and 15 varieties of router.
We're all just waiting for Trilby to show up with a diesel powered chainsaw...
Offline
Just to add more classic unix tools to the pile, the following seems to work as expected...
cat input | cut -f1-3 -d';' | tee >(awk -F ';' '{print $3}' | paste -sd ';') >(awk -F ';' '{print $2}' | paste -sd ';') >(awk -F ';' '{print $1}' | paste -sd ';')
Offline
rs -c';' -C';' -T <file
Fantastic tool! In a hurry I located rs source here https://cgit.freebsd.org/src/tree/usr.bin/rs/rs.c and compiled it after commenting out __FBSDID("$FreeBSD$");. I wonder if this and other such useful tools are somewhere on AUR or official repos, because I couldn't find them.
Behemoth, wake up!
Offline
There is a version on the AUR https://aur.archlinux.org/packages/rs-git/
Offline
Wow thanks!
Offline
Diesel powered chainsaw coming up:
#!/bin/sh
sep=";"
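dir=$(mktemp -d)
# write each input line to its own file, one field per line
for i in $(sed -n = "$1"); do
    sed -n "$i{s/$sep/\n/g;p}" "$1" > "$dir/$i"
done
# paste the per-line files side by side; the glob sorts names as text,
# so inputs with 10 or more lines come out in the wrong row order
paste -d"$sep" "$dir"/*
rm -r "$dir"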
"UNIX is simple and coherent" - Dennis Ritchie; "GNU's Not Unix" - Richard Stallman
Offline
Note that PKGBUILD needs a *lot* of work...
Offline
And for those who prefer the KISS principle (like some bash and awk one-liners):
#!/bin/bash
# nitro (powered wood splitter)
for N in $(seq 11); do
    awk -v cn=$N -F";" '{print $cn}' "${1}" |
    xargs |
    awk '{print $1";"$2";"$3}'
done
Results:
$ ./nitro file.csv
A;1;aa
B;2;bb
C;3;cc
D;4;dd
E;5;ee
F;6;ff
G;7;gg
H;8;hh
I;9;ii
J;10;jj
K;11;kk
Last edited by NuSkool (2021-08-11 00:34:04)
Scripts I use: https://github.com/Cody-Learner
Offline
Well if you're going to hardcode the dimensions, it's much easier than that
for c in $(seq 11); do printf '%s;%s;%s\n' $(cut -d\; -f$c file.csv); done
EDIT: Whoa ... this can actually be made flexible though for any-dimension data (and other separators):
#!/bin/sh
sep=';'
lines=$(sed -n '$=' "$1")
cols=$(sed 1q "$1" | tr "$sep" '\n' | wc -l)
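# build a format like "%s;%s;" with one "%s$sep" per line except the last
fmt=$(printf "%%s$sep%.0s" $(seq 2 $lines))
for c in $(seq $cols); do
    # use $sep here too, so other separators really do work
    printf "$fmt%s\n" $(cut -d"$sep" -f$c "$1")
done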
But this (as many of those above) would choke on any fields with internal spaces.
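The choking happens because the unquoted `$(cut ...)` output is split on whitespace before printf sees it, so a field like `foo bar` becomes two arguments. As a rough sketch of a space-safe alternative, here is a Python version using the standard `csv` module. The function name `transpose_csv` is made up for illustration, and treating double-quoted fields as standard CSV quoting is an assumption the thread does not state.

```python
import csv
import io


def transpose_csv(text, sep=";"):
    """Transpose delimited text, keeping spaces and quoted separators intact."""
    reader = csv.reader(io.StringIO(text), delimiter=sep)
    rows = [row for row in reader if row]  # skip blank lines
    out = io.StringIO()
    writer = csv.writer(out, delimiter=sep, lineterminator="\n")
    # zip(*rows) turns columns into rows; it stops at the shortest row
    writer.writerows(zip(*rows))
    return out.getvalue()


sample = 'name;note\nfoo bar;"x;y"\n'
print(transpose_csv(sample), end="")
```

Fields are only ever split on the separator, so internal spaces survive untouched.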
Last edited by Trilby (2021-08-11 02:09:07)
"UNIX is simple and coherent" - Dennis Ritchie; "GNU's Not Unix" - Richard Stallman
Offline
Here's the real diesel powered chain saw, a single sed process, nothing else at all. And this can work with any number of rows and columns and does just fine with spaces within fields:
sed -n '{H};${g;s/^\n//;:a;h;s/;[^\n]*/;/g;s/\n//g;s/;$//;p;g;s/[^;]*;\([^\n]*\)\n*/\1\n/g;/;/ba;s/\n/;/g;s/;$//;p;}' file.csv
In case anyone wants to actually read that, here's the better looking sed script (exact same content):
{
H
}
$ {
g
s/^\n//
:loop
h
s/;[^\n]*/;/g
s/\n//g
s/;$//
p
g
s/[^;]*;\([^\n]*\)\n*/\1\n/g
/;/ b loop
s/\n/;/g
s/;$//
p
}
I wasn't going to be able to sleep until I did this in pure sed, the language so ugly it's beautiful.
And for educational purposes, a commented version:
# match every line ...
{
# append each line to the hold buffer
H
}
# do this only on / after the last line ...
$ {
# pull the hold buffer into pattern space, and remove the extra initial
# newline that resulted from the use of H rather than h on line 1
g
s/^\n//
# we'll get back to this ... literally
:loop
# put a copy of whatever is in pattern space at the start of the loop back
# into the hold buffer (on loop=1 this is already what's there)
h
# remove the first semicolon and everything between it and the next newline
# replacing this with a single semicolon (i.e., get the first field in each
# line)
s/;[^\n]*/;/g
# remove all the internal newlines joining the first field from each line
# into a single line. Remove the trailing semicolon, then print this first
# column now as a line.
s/\n//g
s/;$//
p
# retrieve a fresh copy of our content from the hold buffer (this is what was
# stored at the start of the loop).
g
# trim and discard the first field of each line, keeping everything else
s/[^;]*;\([^\n]*\)\n*/\1\n/g
# if there are any remaining semicolons (i.e., more than one field left) loop
/;/ b loop
# we're out of the loop, so there's only one field left (on each line) in the
# pattern space. Replace newlines with semicolons, then remove the last
# semicolon and print our final column-now-line
s/\n/;/g
s/;$//
p
}
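For anyone who finds the loop easier to follow outside sed, here is a rough Python sketch of the same peel-the-first-field idea. It is not a translation of the sed buffers, just the same steps: print the first field of every line as one row, drop that field from every line, and repeat while any separator is left. The name `transpose_by_peeling` is made up for illustration.

```python
def transpose_by_peeling(text, sep=";"):
    """Transpose by repeatedly peeling the first field off every line."""
    lines = text.splitlines()
    out = []
    # loop while any line still has more than one field
    while any(sep in line for line in lines):
        parts = [line.partition(sep) for line in lines]
        # the first field of each line becomes one output row
        out.append(sep.join(head for head, _, _ in parts))
        # keep only what follows the first separator on each line
        lines = [rest for _, _, rest in parts]
    # one field left per line: join them into the final row
    out.append(sep.join(lines))
    return "\n".join(out)


print(transpose_by_peeling("A;B;C\n1;2;3\naa;bb;cc"))
```

Since the text is only ever split on the separator, fields with internal spaces pass through unchanged, as with the sed script.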
FYI, some running-time comparisons of 1000 times through a loop using this sed version or the awk version in post #3:
TOOL           TIME
coreutils awk  3.17s
coreutils sed  2.54s
busybox awk    0.05s
busybox sed    0.05s
In other words, sed came out somewhat faster than awk in the coreutils case, but busybox leaves coreutils in the *&^*#ing dust.
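The post does not show how these timings were taken, so the following is only a guess at a comparable harness: a small Python sketch that runs a command a fixed number of times and reports total wall-clock time. The name `time_command` is made up, and the stand-in command (the Python interpreter doing nothing) is there only so the sketch runs anywhere; substitute the sed or awk invocation to compare them.

```python
import subprocess
import sys
import time


def time_command(argv, runs=1000):
    """Return total wall-clock seconds for `runs` executions of argv."""
    start = time.perf_counter()
    for _ in range(runs):
        # discard output so terminal I/O does not dominate the measurement
        subprocess.run(argv, stdout=subprocess.DEVNULL, check=True)
    return time.perf_counter() - start


# Stand-in command (assumption); replace with e.g. ["sed", "-n", SCRIPT, "file.csv"]
elapsed = time_command([sys.executable, "-c", "pass"], runs=3)
print(f"{elapsed:.2f}s")
```

For inputs this small, most of the measured time is process startup rather than the transpose itself, which may be part of why the busybox numbers are so low.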
Last edited by Trilby (2021-08-11 15:32:39)
"UNIX is simple and coherent" - Dennis Ritchie; "GNU's Not Unix" - Richard Stallman
Offline