Hi,
Is there any tool for detecting the largest repeated block of content within a single file?
For example:
line 1
line 2
line 1
line 3
line 1
line 2
Here, lines 1-2 are duplicated at lines 5-6, and line 1 is also repeated at line 3.
uniq and sort assume you have a structured file, so they don't work for me.
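If the positions of the repeats matter (sorting throws away the original line order), one common approach is an awk one-liner that remembers every line it has seen and reports repeats by line number. This is a sketch added for illustration, not something suggested in the thread; it uses the sample file above fed in through a here-document:

```shell
# seen[$0]++ counts occurrences of each line's content; it is 0 (false)
# the first time a line appears, so only repeats are printed, tagged
# with NR, the current line number.
awk 'seen[$0]++ { print NR ": " $0 }' <<'EOF'
line 1
line 2
line 1
line 3
line 1
line 2
EOF
# prints:
# 3: line 1
# 5: line 1
# 6: line 2
```

Unlike `sort | uniq`, this preserves the original order and needs no sorted input.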
Thanks in advance,
Xan
Last edited by xanb (2013-10-22 10:25:23)
Owning one OpenRC (artoo way) and other three systemd machines
Offline
Are you saying that
sort <file> | uniq -cd
is not for you? Why?
Last edited by karol (2013-10-21 14:13:37)
Offline
karol wrote: Are you saying that
sort <file> | uniq -cd
is not for you? Why?
Can't the same thing be achieved with
sort -u <file>
?
Offline
karol wrote: Are you saying that
sort <file> | uniq -cd
is not for you? Why?
Can't the same thing be achieved with
sort -u <file>
?
Ummm, not necessarily.
$ sort -u <file>
line 1
line 2
line 3
$ sort <file> | uniq -cd
      3 line 1
      2 line 2
The latter tells you which lines are duplicated, triplicated, and so on: it prints the contents of each repeated line together with its count, not the line numbers where the repeats occur.
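As a side note on why the sort step is needed: uniq only collapses adjacent identical lines, so feeding the unsorted sample file straight into `uniq -cd` reports nothing. A minimal sketch (the `printf` just recreates the sample file from the first post):

```shell
# Without sorting, no two identical lines are adjacent,
# so uniq -cd prints nothing.
printf 'line 1\nline 2\nline 1\nline 3\nline 1\nline 2\n' | uniq -cd

# After sorting, identical lines become adjacent and are counted:
# 3x "line 1" and 2x "line 2".
printf 'line 1\nline 2\nline 1\nline 3\nline 1\nline 2\n' | sort | uniq -cd
```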
Offline
Thanks a lot, both of you. This is exactly what I wanted.
Owning one OpenRC (artoo way) and other three systemd machines
Offline