#1 2013-06-15 13:30:05

arith
Member
Registered: 2012-04-15
Posts: 5

[bash] How to uniq files in a file list with paths?

Hi folks,

I know I can get rid of duplicate entries in a list by sorting and uniqing:

sort file | uniq

But how can I uniq filenames in a file list with paths?

/foo/abc.bar
/foo/def.bar
/foo/bar/def.bar
/foo/ghi.bar

Is it necessary to write my own comparison routine like “output thisline if basename(thisline) != basename(lastline)” (would require a basename-sorted input list)?

#2 2013-06-15 14:14:19

karol
Archivist
Registered: 2009-05-06
Posts: 25,430

Re: [bash] How to uniq files in a file list with paths?

arith wrote:

Hi folks,

I know I can get rid of duplicate entries in a list by sorting and uniqing:

sort file | uniq

Or 'sort -u':

$ cat test
a
e
b
d
c
a
d
$ sort -u test
a
b
c
d
e
arith wrote:

But how can I uniq filenames in a file list with paths?

/foo/abc.bar
/foo/def.bar
/foo/bar/def.bar
/foo/ghi.bar

Is it necessary to write my own comparison routine like “output thisline if basename(thisline) != basename(lastline)” (would require a basename-sorted input list)?

I came up with

$ cat test
/home/karol/test/test0
/home/karol/test/test/test0
/home/karol/test/foo/test00
/home/karol/test/foo/bar/test0
/home/karol/test/foo/bar/test1
/home/karol/test1
$ awk -F/ '{print $0" "$NF}' test | sort -k2 | uniq --skip-fields=1 | awk '{print $1}'
/home/karol/test/foo/bar/test0
/home/karol/test/foo/test00
/home/karol/test/foo/bar/test1

but I think there's a better way to do it ;P

#3 2013-06-15 14:28:43

Procyon
Member
Registered: 2008-05-07
Posts: 1,819

Re: [bash] How to uniq files in a file list with paths?

@karol: Note that you want a stable sort here (sort -sk2 instead of sort -k2), so that among paths with the same basename, the one listed first in the input is the one that gets printed.
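
As a rough sketch of that suggestion, the stable variant of the pipeline might look like the function below. The function name uniq_by_basename_stable is made up for illustration, LC_ALL=C is an assumption added only to make the ordering predictable, and the pipeline still assumes that no path contains whitespace. It uses -k2,2 rather than -k2 to limit the sort key to the basename field, and the POSIX spelling uniq -f 1 in place of --skip-fields=1.

```shell
# Sketch only: karol's pipeline with a stable sort (-s), so ties keep input order.
# Assumes no whitespace in any path (same limitation as the original pipeline).
uniq_by_basename_stable() {
    awk -F/ '{ print $0 " " $NF }' "$@" |  # append the basename as a second field
        LC_ALL=C sort -s -k2,2 |           # stable sort on the basename field only
        uniq -f 1 |                        # skip field 1 (the path), compare basenames
        awk '{ print $1 }'                 # strip the helper field again
}
```

Run as "uniq_by_basename_stable test" on the six-line test file above, this should print /home/karol/test/test0, /home/karol/test/foo/test00 and /home/karol/test/foo/bar/test1, i.e. the first path seen for each basename.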

An associative array would work too for this problem and you won't have the issue of spaces in filenames.
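
A minimal sketch of the associative-array idea in bash (4.0 or later) could look like this. The function name uniq_by_basename and the "/" key prefix are assumptions for illustration; the prefix is only there because bash rejects an empty subscript, which a path ending in "/" would otherwise produce.

```shell
#!/usr/bin/env bash
# Sketch only: print the first path seen for each basename, in input order.
# Reads one path per line on stdin; spaces in filenames are fine.
uniq_by_basename() {
    local -A seen=()          # associative array keyed by basename (bash 4+)
    local path key
    while IFS= read -r path; do
        key="/${path##*/}"    # strip everything up to the last "/", keep key non-empty
        if [[ -z ${seen[$key]+x} ]]; then
            seen[$key]=1      # remember this basename
            printf '%s\n' "$path"
        fi
    done
}
```

Usage would be something like "uniq_by_basename < filelist". Because each line is read whole with IFS= read -r, a name such as "/foo/my file.bar" is handled like any other path.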

#4 2013-06-15 14:36:11

wirr
Member
Registered: 2009-10-25
Posts: 68

Re: [bash] How to uniq files in a file list with paths?

karol wrote:
arith wrote:

Hi folks,

I know I can get rid of duplicate entries in a list by sorting and uniqing:

sort file | uniq

Or 'sort -u':

I came up with

[...] sort -k2 | uniq --skip-fields=1 [...]

Or 'sort -u -k2,2' :)
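
Spelled out in full, that shortcut might look like the sketch below, with sort -u -k2,2 standing in for the separate sort and uniq steps. The function name uniq_by_basename_sort_u is hypothetical, LC_ALL=C is an assumption added for predictable ordering, and paths are again assumed to contain no whitespace.

```shell
# Sketch only: let sort drop duplicate basenames itself via -u on the key field.
# Assumes no whitespace in any path.
uniq_by_basename_sort_u() {
    awk -F/ '{ print $0 " " $NF }' "$@" |  # append the basename as a second field
        LC_ALL=C sort -u -k2,2 |           # keep one line per distinct basename
        awk '{ print $1 }'                 # strip the helper field again
}
```

The output is ordered by basename, with one path per basename.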

Last edited by wirr (2013-06-15 14:36:26)

#5 2013-06-15 15:49:12

Trilby
Forum Moderator
From: Massachusetts, USA
Registered: 2011-11-29
Posts: 13,992

Re: [bash] How to uniq files in a file list with paths?

awk -F / '{list[$NF]=$0} END{for(base in list) print list[base]}' test

awk ftw.

Last edited by Trilby (2013-06-15 15:50:45)



#6 2013-06-15 19:01:49

Procyon
Member
Registered: 2008-05-07
Posts: 1,819

Re: [bash] How to uniq files in a file list with paths?

@Trilby: You can keep the first occurrence of each basename by adding a test like if (list[$NF] == "") before the assignment, so that an earlier entry is not overwritten by a later one.
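
One way to sketch this is shown below. Note that it is a variation rather than the exact one-liner above: it prints each path as soon as its basename is first seen instead of looping in END, because the order of an awk for (base in list) loop is unspecified. The function name uniq_by_basename_awk and the array name seen are made up for illustration.

```shell
# Sketch only: print the first path for each basename, in input order.
# Splits on "/" only, so spaces in filenames are not a problem.
uniq_by_basename_awk() {
    awk -F/ '!($NF in seen) { seen[$NF] = 1; print }' "$@"
}
```

For the four-line list from the first post, "uniq_by_basename_awk filelist" should print /foo/abc.bar, /foo/def.bar and /foo/ghi.bar, dropping /foo/bar/def.bar as the later duplicate.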
