I have a .csv which is an export of a simple voxel model:
8,4,1
#4F4530FF,#4F442FFF,#4E442FFF,#4E442FFF,#4B412DFF,#4E442FFF,#4F442FFF,#514631FF
#645639FF,#615337FF,#695A3CFF,#68593BFF,#615337FF,#67593BFF,#605337FF,#68593BFF
#4E442FFF,#4F442FFF,#514631FF,#4C422EFF,#4C422DFF,#4C422DFF,#4F4530FF,#4F4530FF
#6E5F3EFF,#6D5E3EFF,#6C5D3DFF,#6B5C3DFF,#706140FF,#695A3BFF,#6E5F3EFF,#695A3BFF
I needed to parse it down to just the sorted unique hex values. So obviously,
grep -o '#......' model.csv |sort -u
But the final output was intended for a .json list, so this would be even better:
"#000000",
...
"#ffffff"
Sure, I could add a pipe to sed or such, but I'm a masochist so I decided to do it all in awk. I'd never done anything more complicated than a one-liner before, so criticism is appreciated:
BEGIN{
    FS = ","
    hex_regexp = "^#[0-9a-fA-F]{8}$"
    delete all_vals[0]
}
/#/{
    for (i=1; i<=NF; i++)
    {
        if ( $i ~ hex_regexp )
        {
            len = length(all_vals) + 1
            all_vals[len] = substr($i, 1, 7)  # awk strings are 1-indexed
        }
    }
}
END{
    PROCINFO["sorted_in"] = "@val_str_asc"
    for (i in all_vals)
    {
        a = all_vals[i]
        matched = 0
        for (j in unique_vals)
        {
            b = unique_vals[j]
            if (a == b)
            {
                matched = 1
                break
            }
        }
        if (matched == 0)
        {
            len = length(unique_vals) + 1
            unique_vals[len] = a
        }
    }
    for (i in unique_vals)
    {
        if ( i != length(unique_vals) )
            print "\"" unique_vals[i] "\","
        else
            print "\"" unique_vals[i] "\""
    }
}
One thing that's quirky is the need to delete all_vals[0], which AFAICT is necessary to "initialize" it as an array. Otherwise awk throws an error:
$ awk -f voxel_colors.awk voxel.csv
awk: voxel_colors.awk:12: (FILENAME=voxel.csv FNR=2) fatal: attempt to use scalar `all_vals' as an array
I only knew to do that thanks to SO. It seems for (i in var) also initializes var as an array (only if first reference?), which is presumably why delete unique_vals[0] is not necessary.
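A sketch of one way to sidestep the initialization quirk: keep the element count in a separate counter instead of calling length() on the array. This assumes the error comes from length(all_vals) being the first reference to the variable, which makes awk treat it as a scalar; with a counter, the first reference is a subscript. The names collect_colors, vals and n are made up for illustration, and the length($i) == 9 check stands in for the {8} interval so the pattern also works in awks without interval expressions.

```shell
# collect_colors is a hypothetical helper; it reads CSV from its file
# arguments, or from stdin when none are given.
collect_colors() {
    awk -F, '
    /#/ {
        for (i = 1; i <= NF; i++)
            if (length($i) == 9 && $i ~ /^#[0-9a-fA-F]+$/)
                vals[++n] = substr($i, 1, 7)   # first use is a subscript, so vals is an array
    }
    END {
        for (i = 1; i <= n; i++)
            print vals[i]
    }' "$@"
}
```

Usage would be something like collect_colors model.csv, which prints every truncated color in input order, duplicates included.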
Edit: although the solution jason led me to is best, I realized my original END block can be cut down to
END{
    PROCINFO["sorted_in"] = "@val_str_asc"
    b = ""
    for (i in all_vals)
    {
        a = all_vals[i]
        if ( a != b ) {
            b = a
            printf "\"%s\",\n", a
        }
    }
}
Last edited by alphaniner (2016-01-23 18:44:42)
But whether the Constitution really be one thing, or another, this much is certain - that it has either authorized such a government as we have had, or has been powerless to prevent it. In either case, it is unfit to exist.
-Lysander Spooner
Would this not be simpler?
awk 'BEGIN { RS=","} /^#/ {printf "%s,\n", $1}' file
It misses the first entry on each line. RS = "[\n,]" seems to solve that. Doing something like that was my first thought, but I didn't know if it was possible so I went with the for loop. Thanks for pointing it out.
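To illustrate why the first entry on each line goes missing with RS=",": the newline is no longer a record separator, so the last entry of one line and the first entry of the next end up in the same record as $1 and $2. A small sketch with made-up two-line input; show_records is a hypothetical helper name.

```shell
# show_records is an illustrative helper; the input is invented sample data.
show_records() {
    printf '#111111FF,#222222FF\n#333333FF,#444444FF\n' |
    awk 'BEGIN { RS = "," } { printf "NF=%d $1=%s $2=%s\n", NF, $1, $2 }'
}
show_records
```

The second record has two fields, so printing only $1 drops #333333FF. Note that a regex RS such as "[\n,]" is an extension; gawk supports it, but POSIX leaves a multi-character RS unspecified.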
However, I wasn't aiming for simplicity anyway; I wanted to fully replicate e.g.
grep -o '#......' model.csv |sort -u |awk '{print "\""$0"\","}'
with a single awk program & no pipes.
Ugh: completely missed the uniq bit, sorry.
Give this a shot:
awk 'BEGIN { RS="," } /^#/ { !a[$1]++ } END {for (b in a) { printf "\"%s\",\n", b }}' file
Wow. Assigning to indices to avoid the need for manual uniq-ing is a great trick. Thanks! It seems that referencing the key is all that's necessary to have it added. Is there a reason you use !a[$1]++ rather than just a[$1] ?
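On the !a[$1]++ question: as far as I can tell, a bare reference is enough to create the key, which is all an END loop over the array needs. The negated post-increment is more commonly used as a pattern, where it is true only the first time a value is seen, so it deduplicates while preserving input order. A minimal sketch with made-up input; first_seen, key_count and seen are illustrative names.

```shell
# Pattern form: a line is printed only the first time it appears.
first_seen() {
    awk '!seen[$0]++'
}

# A bare reference creates the key even though nothing is assigned to it.
key_count() {
    awk 'BEGIN { a["x"]; n = 0; for (k in a) n++; print n }'
}

printf 'b\na\nb\nc\na\n' | first_seen
key_count
```

Used as an action statement, as in { !a[$1]++ }, the value of the expression is discarded, so only the side effect of creating the key matters.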
I still need to modify the RS to include "\n" and specify array scan order, and verify only records of the correct form are included just to be safe. Putting it all together:
BEGIN {
    RS="[\n,]"
    regex = "^#[0-9a-fA-F]{8}$"
}
$0 ~ regex { a[substr($1,1,7)] }  # awk strings are 1-indexed
END {
    PROCINFO["sorted_in"] = "@ind_str_asc"
    for (b in a) {
        printf "\"%s\",\n", b
    }
}
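Two caveats on the script above: PROCINFO["sorted_in"] and a regex RS are gawk features, and the printf leaves a comma after the last element, which a strict JSON parser will reject. A rough portable sketch that loops over comma-separated fields, sorts with a small insertion sort, and prints the comma before every element except the first. The names unique_colors_json, seen, vals and n are my own, not from the thread, and the length($i) == 9 check stands in for the {8} interval.

```shell
# unique_colors_json is a hypothetical helper; it reads CSV from its file
# arguments, or from stdin when none are given.
unique_colors_json() {
    awk -F, '
    {
        for (i = 1; i <= NF; i++) {
            # Accept "#" followed by exactly 8 hex digits.
            if (length($i) == 9 && $i ~ /^#[0-9a-fA-F]+$/) {
                c = substr($i, 1, 7)
                if (!(c in seen)) { seen[c] = 1; vals[++n] = c }
            }
        }
    }
    END {
        # Insertion sort on the unique values (string comparison).
        for (i = 2; i <= n; i++) {
            v = vals[i]
            for (j = i - 1; j >= 1 && vals[j] > v; j--)
                vals[j + 1] = vals[j]
            vals[j + 1] = v
        }
        # Separator goes before each element except the first.
        for (i = 1; i <= n; i++)
            printf "%s\"%s\"", (i > 1 ? ",\n" : ""), vals[i]
        if (n) printf "\n"
    }' "$@"
}
```

Usage would be something like unique_colors_json model.csv. An insertion sort is quadratic, which should be fine for a palette of a few hundred colors but is not meant for large inputs.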
No, it was late and I was tired, so wasn't thinking all that clearly (missing the uniq requirement was a bit of a tell)...
Glad you got it sorted (no pun intended).