| Date: Mon, 02 Feb 2004 20:12:10 +0100
| From: Brian Foster <blf at utvinternet.ie>
|[ ... ]
| be aware, however, that both uniq(1) and `comm'
| require the input to be sorted.
| ( I must confess I have never understood why the
| input must be sorted. `uniq' could still deal
| with _adjacent_ duplicate lines (e.g., N and N+1),
| and `comm' could compare line N with line N.
| why the insistence on sorting? ) [ ... ]
sorry for replying to my own post.
the above is (almost) all gibberish.
uniq(1) does not require its input to be sorted,
unless you want to find/remove all duplicates.
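
to illustrate the point about adjacent duplicates, here is a minimal Python sketch of roughly what uniq does by default (collapse runs of identical neighbouring lines). the function name `uniq_adjacent` and the sample data are made up for this example; it is not how uniq(1) is actually implemented.

```python
from itertools import groupby


def uniq_adjacent(lines):
    """Collapse each run of identical adjacent lines into a single line."""
    # groupby() groups consecutive equal items, so non-adjacent
    # duplicates end up in separate groups and survive.
    return [key for key, _run in groupby(lines)]


unsorted_lines = ["b", "a", "a", "b"]  # hypothetical input

# Unsorted: only the adjacent "a", "a" pair is collapsed.
print(uniq_adjacent(unsorted_lines))          # ['b', 'a', 'b']

# Sorted first: every duplicate becomes adjacent, so all are removed.
print(uniq_adjacent(sorted(unsorted_lines)))  # ['a', 'b']
```

so sorting is only needed when the goal is to catch every duplicate, not just neighbouring ones.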
comm(1) requires sorted input because making
a tri-state decision (file1-only, file2-only,
or both-files) on the basis of a line-by-line
(bi-valued) comparison is not possible in a single
pass otherwise: when two lines differ, only the sort
order says which file is missing which line.
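
a rough Python sketch of the kind of single-pass merge that sorted input makes possible. the name `comm_sorted` and the sample lists are assumptions for illustration only, and this is not claimed to be comm(1)'s real implementation.

```python
def comm_sorted(file1, file2):
    """Split two sorted lists of lines into (only in 1, only in 2, in both)."""
    only1, only2, both = [], [], []
    i = j = 0
    while i < len(file1) and j < len(file2):
        if file1[i] < file2[j]:
            # file2 is already past this value, so it is unique to file1.
            only1.append(file1[i])
            i += 1
        elif file1[i] > file2[j]:
            # file1 is already past this value, so it is unique to file2.
            only2.append(file2[j])
            j += 1
        else:
            both.append(file1[i])
            i += 1
            j += 1
    # Whatever is left over in either file has no partner in the other.
    only1.extend(file1[i:])
    only2.extend(file2[j:])
    return only1, only2, both


# Sorted input: the three-way split is correct.
print(comm_sorted(["a", "b", "d"], ["b", "c", "d"]))
# (['a'], ['c'], ['b', 'd'])

# Unsorted first file: "a" is wrongly reported as unique to each file.
print(comm_sorted(["b", "a"], ["a", "b"]))
# (['a'], ['a'], ['b'])
```

the less-than / greater-than branches are where the sort order does the work; without it, a mismatch says nothing about which side to advance.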
now I just hope the work-related e-mail I sent
late last night is more coherent than the above
nonsense! ;-\
cheers!
-blf-
p.s. my clear(er?) memory now in the morning is that it was
older, and maybe only non-GNU, versions of join(1)
which inexplicably insisted on sorted input.
however, I note the GNU join man(1) page does not
mention sorting, so perhaps that nonsense has been
removed. or I am just imagining it?
--
«How many surrealists does it take to | Brian Foster Montpellier,
change a lightbulb? Three. One calms | blf at utvinternet.ie France
the warthog, and two fill the bathtub | Stop E$$o (ExxonMobile)!
with brightly-colored machine tools.» | http://www.stopesso.com