LINUX.IE, website of the Irish Linux Users' Group

[ILUG] Remove duplicate lines from a file?


Niall O Broin niall at magicgoeshere.com
Fri Jun 30 16:49:34 IST 2000


On Fri, Jun 30, 2000 at 03:11:10PM +0100, Conor Daly wrote:

> Anyone know of a way to remove duplicate lines from a file using something
> like grep, sed or perl?  I want something I can use in a pipe preferably.
> The duplicates are distributed randomly through the file and I don't want to
> sort if I can avoid it.  Something like

Well, obviously the answer is Perl - now what was the question? uniq is out
of the question because it only removes adjacent duplicate lines, so it needs
sorted input, and the business of prepending a number and then removing it
offends me :-) so I offer

perl -ne 'print unless ($seen{$_}++)'

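as a pipe to do the job. There's one slight hitch - this will consume memory
like there's no tomorrow, because every distinct line is kept as a hash key
until the end of the input. If the distinct lines in the file(s) you want to
treat are somewhat smaller in total than your free virtual memory, you'll be OK.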
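
For anyone who would rather see the idea spelled out, here is a rough Python sketch of the same filter, not a replacement for the one-liner. The first function mirrors the %seen hash directly. The second is an assumption on my part, not something from the thread: it remembers a 16-byte digest of each distinct line instead of the line itself, which may cut memory when lines are long, at the cost of a very small chance that two different lines collide and one is dropped. The function names are made up for illustration.

```python
import hashlib
import sys
from typing import Iterable, Iterator


def unique_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield each line the first time it appears, preserving input order."""
    seen = set()  # every distinct line so far, like %seen in the Perl version
    for line in lines:
        if line not in seen:
            seen.add(line)
            yield line


def unique_lines_by_digest(lines: Iterable[str]) -> Iterator[str]:
    """Same filter, but store a fixed-size digest per distinct line.

    Assumption: a digest collision between two different lines is an
    acceptable (extremely unlikely) risk for the data being filtered.
    """
    seen = set()  # 16-byte digests rather than whole lines
    for line in lines:
        key = hashlib.blake2b(line.encode("utf-8"), digest_size=16).digest()
        if key not in seen:
            seen.add(key)
            yield line


if __name__ == "__main__":
    # Works as a pipe: reads stdin, writes first occurrences to stdout.
    sys.stdout.writelines(unique_lines(sys.stdin))
```

Saved under a hypothetical name such as dedup.py, it would be used as "python3 dedup.py < input > output". Like the Perl, it compares whole lines including the trailing newline, and memory still grows with the number of distinct lines either way.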




Regards,


Niall


Bon weekend !



