[SOLVED] Bash script swap columns and rows from .csv

SupKurtJ · 2021-06-30 21:58:35

Hi,

I have a file.csv like this :

A;B;C;D;E;F;G;H;I;J;K
1;2;3;4;5;6;7;8;9;10;11
aa;bb;cc;dd;ee;ff;gg;hh;ii;jj;kk

And I want to convert it to :
A;1;aa
B;2;bb
C;3;cc

Etc....

Do you have any ideas, please?

Thanks

Last edited by SupKurtJ (2021-06-30 22:36:00)

ewaller · 2021-06-30 22:07:57

This does not address the use of Bash, but in my opinion, awk is the better tool for this.

SupKurtJ · 2021-06-30 22:24:10

I found a solution, thanks:

awk -F";" '{for(i=1;i<=NF;i++){A[NR,i]=$i};if(NF>n){n=NF}}
END{for(i=1;i<=n;i++){
for(j=1;j<=NR;j++){
s=s?s";"A[j,i]:A[j,i]}
print s;s=""}}' file

Last edited by SupKurtJ (2021-06-30 22:24:45)

bulletmark · 2021-06-30 22:53:04

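#!/usr/bin/python
import sys
# read every row, splitting each line on the ';' separator
lines = [l.strip().split(';') for l in sys.stdin]
# loop over column indices (not row indices): one output line per input column
for n in range(len(lines[0])):
    print(';'.join(l[n] for l in lines))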

Run it as `prog <file.csv`
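
The same idea can also be written with zip. What follows is only a sketch: the function name transpose and the choice to pad short rows with empty strings are my assumptions, not something anyone above asked for.

```python
from itertools import zip_longest


def transpose(lines, sep=';'):
    """Return the transposed rows of sep-delimited lines as a list of strings."""
    rows = [line.rstrip('\n').split(sep) for line in lines]
    # zip_longest pads short rows with '' so ragged input does not raise
    return [sep.join(col) for col in zip_longest(*rows, fillvalue='')]


# small inline sample; in a real script you would pass sys.stdin instead
sample = ["A;B;C", "1;2;3", "aa;bb;cc"]
for line in transpose(sample):
    print(line)
```

Since only the separator is split on, fields with internal spaces should pass through untouched.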

jasonwryan · 2021-06-30 23:50:13

rs -c';' -C';' -T <file

https://www.freebsd.org/cgi/man.cgi?query=rs

Last edited by jasonwryan (2021-06-30 23:51:12)

bulletmark · 2021-07-01 00:00:42

@jasonwryan, so install BSD first?

ewaller · 2021-07-01 00:03:34

With Jason, I always feel like the guy with a hammer, saw, and screwdriver; he is the guy with the planer, joiner, lathe, radial arm saw, and 15 varieties of router.

jasonwryan · 2021-07-01 00:20:29

bulletmark wrote:

@jasonwryan, so install BSD first?

For a one-liner? Of course!

ewaller wrote:

He is the guy with the planer, joiner, lathe, radial arm saw, and 15 varieties of router.

We're all just waiting for Trilby to show up with a diesel powered chainsaw...

thiagowfx · 2021-07-01 20:43:36

Just to add more classic unix tools to the pile, the following seems to work as expected...

cat input | cut -f1-3 -d';' | tee >(awk -F ';' '{print $3}' | paste -sd ';') >(awk -F ';' '{print $2}' | paste -sd ';') >(awk -F ';' '{print $1}' | paste -sd ';')

porcelain1 · 2021-08-10 18:41:21

jasonwryan wrote:

rs -c';' -C';' -T <file
https://www.freebsd.org/cgi/man.cgi?query=rs

Fantastic tool! In a hurry I located rs source here https://cgit.freebsd.org/src/tree/usr.bin/rs/rs.c and compiled it after commenting out __FBSDID("$FreeBSD$");. I wonder if this and other such useful tools are somewhere on AUR or official repos, because I couldn't find them.

jasonwryan · 2021-08-10 18:43:09

There is a version on the AUR https://aur.archlinux.org/packages/rs-git/

porcelain1 · 2021-08-10 19:19:50

Wow thanks!

Trilby · 2021-08-10 19:21:41

Diesel powered chainsaw coming up:

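#!/bin/sh

sep=";"

dir=$(mktemp -d)

# one temp file per input line, holding that line's fields one per row
for i in $(sed -n = "$1"); do
	# zero-pad the file name so the glob below keeps line order past line 9
	sed -n "$i{s/$sep/\n/g;p}" "$1" > "$dir/$(printf '%06d' "$i")"
done
paste -d"$sep" "$dir"/*

rm -r "$dir"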

jasonwryan · 2021-08-10 19:21:50

Note that PKGBUILD needs a *lot* of work...

NuSkool · 2021-08-11 00:32:14

And for those who prefer the KISS principle (like some bash and awk one-liners):

#!/bin/bash
# nitro (powered wood splitter)

for N in $(seq 11); do

	awk -v cn=$N -F";" '{print $cn}' "${1}"	|
	xargs					|
	awk '{print $1";"$2";"$3}'
done

Results:

$ ./nitro file.csv
A;1;aa
B;2;bb
C;3;cc
D;4;dd
E;5;ee
F;6;ff
G;7;gg
H;8;hh
I;9;ii
J;10;jj
K;11;kk

Last edited by NuSkool (2021-08-11 00:34:04)

Trilby · 2021-08-11 01:59:37

Well, if you're going to hardcode the dimensions, it's much easier than that:

for c in $(seq 11); do printf '%s;%s;%s\n' $(cut -d\; -f$c file.csv); done

EDIT: Whoa ... this can actually be made flexible though for any-dimension data (and other separators):

#!/bin/sh

sep=';'

lines=$(sed -n '$=' "$1")
cols=$(sed 1q "$1" | tr "$sep" '\n' | wc -l)

fmt=$(printf "%%s$sep%.0s" $(seq 2 $lines))
for c in $(seq $cols); do
	printf "$fmt%s\n" $(cut -d"$sep" -f$c "$1")
done

But this (as many of those above) would choke on any fields with internal spaces.
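
To make the spaces problem concrete, here is a rough Python imitation of what the unquoted $(cut ...) does. It is only an approximation and rests on a few assumptions: default IFS, no glob characters in the data, and a helper name (shell_like_words) that I made up.

```python
def shell_like_words(column_text):
    """Approximate an unquoted $(...) expansion with default IFS: split on any whitespace."""
    return column_text.split()


# what cut would print for one column of a three-line file
column = "aa\nb b\ncc\n"

fields = column.splitlines()       # the three fields we actually have
words = shell_like_words(column)   # the four words printf would receive

print(len(fields), fields)
print(len(words), words)
```

The format string is built for one argument per input line, so the extra word shifts everything along, and printf reuses the format for whatever is left over.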

Last edited by Trilby (2021-08-11 02:09:07)

Trilby · 2021-08-11 02:55:58

Here's the real diesel powered chain saw, a single sed process, nothing else at all. And this can work with any number of rows and columns and does just fine with spaces within fields:

sed -n '{H};${g;s/^\n//;:a;h;s/;[^\n]*/;/g;s/\n//g;s/;$//;p;g;s/[^;]*;\([^\n]*\)\n*/\1\n/g;/;/ba;s/\n/;/g;s/;$//;p;}' file.csv

In case anyone wants to actually read that, here's the better looking sed script (exact same content):

{
	H
}
$ {
	g
	s/^\n//

	:loop
	h
	s/;[^\n]*/;/g
	s/\n//g
	s/;$//
	p

	g
	s/[^;]*;\([^\n]*\)\n*/\1\n/g
	/;/ b loop

	s/\n/;/g
	s/;$//
	p
}

I wasn't going to be able to sleep until I did this in pure sed, the language so ugly it's beautiful.

And for educational purposes, a commented version:

# match every line ...
{
	# append each line to the hold buffer
	H
}
# do this only on / after the last line ...
$ {
	# pull the hold buffer into pattern space, and remove the extra initial
	# newline that resulted from the use of H rather than h on line 1
	g
	s/^\n//

	# we'll get back to this ... literally
	:loop
	# put a copy of whatever is in pattern space at the start of the loop back
	# into the hold buffer (on loop=1 this is already what's there)
	h
	# remove the first semicolon and everything between it and the next newline
	# replacing this with a single semicolon (i.e., get the first field in each
	# line)
	s/;[^\n]*/;/g
	# remove all the internal newlines joining the first field from each line
	# into a single line.  Remove the trailing semicolon, then print this first
	# column now as a line.
	s/\n//g
	s/;$//
	p

	# retrieve a fresh copy of our content from the hold buffer (this is what
	# was stored at the start of the loop)
	g
	# trim and discard the first field of each line, keeping everything else
	s/[^;]*;\([^\n]*\)\n*/\1\n/g
	# if there are any remaining semicolons (i.e., more than one field left) loop
	/;/ b loop

	# we're out of the loop, so there's only one field left (on each line) in the
	# pattern space.  Replace newlines with semicolons, then remove the last
	# semicolon and print our final column-now-line
	s/\n/;/g
	s/;$//
	p
}
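
For anyone who finds the loop easier to follow outside of sed, here is the same peel-off-the-first-field idea as a Python sketch. It is not a translation of the script line by line: the function name and the list standing in for the hold buffer are my own, and it assumes rectangular input (every line has the same number of fields).

```python
def transpose_by_peeling(text, sep=';'):
    """Transpose sep-delimited text by repeatedly peeling the first field off every line."""
    held = text.rstrip('\n').split('\n')   # plays the role of the hold buffer
    out = []
    while True:
        # first field of each line, joined into one output line (the "p" step)
        out.append(sep.join(line.split(sep, 1)[0] for line in held))
        # stop once no line has a separator left, like the "/;/ b loop" test
        if not any(sep in line for line in held):
            break
        # discard the first field of each line and keep the rest
        held = [line.split(sep, 1)[1] if sep in line else '' for line in held]
    return out


for line in transpose_by_peeling("A;B;C\n1;2;3\naa;bb;cc\n"):
    print(line)
```

As in the sed version, nothing here splits on whitespace, so spaces inside fields are left alone.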

FYI, some running-time comparisons of 1000 times through a loop using this sed version or the awk version in post #3:

TOOL            TIME
coreutils awk   3.17s
coreutils sed   2.54s
busybox awk     0.05s
busybox sed     0.05s

In other words, sed is generally faster than awk, but busybox leaves coreutils in the *&^*#ing dust.
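
The loop used for those timings is not shown, so the following is only a guess at how one could repeat the comparison: a small Python harness that runs a command a fixed number of times and reports the total wall-clock time. The function name time_runs and the placeholder command are assumptions, and Python's own process-spawning overhead is included in the figure.

```python
import subprocess
import sys
import time


def time_runs(argv, stdin_text, runs=1000):
    """Return total wall-clock seconds for `runs` invocations of argv fed stdin_text."""
    start = time.perf_counter()
    for _ in range(runs):
        subprocess.run(argv, input=stdin_text, text=True,
                       stdout=subprocess.DEVNULL, check=True)
    return time.perf_counter() - start


data = "A;B;C\n1;2;3\naa;bb;cc\n"
# Placeholder command so the sketch runs anywhere; swap in something like
# ["sed", "-n", "-f", "transpose.sed"] (hypothetical file name) to compare tools.
elapsed = time_runs([sys.executable, "-c", "import sys; sys.stdin.read()"], data, runs=5)
print(f"{elapsed:.2f}s")
```

For inputs this small the run time is mostly process startup, so the comparison is only fair if every tool is timed the same way.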

Last edited by Trilby (2021-08-11 15:32:39)
