LINUX.IE, website of the Irish Linux Users' Group

[ILUG] Remove duplicate lines from a file?

Conor Daly conor.daly at oceanfree.net
Fri Jun 30 22:34:35 IST 2000


-----Original Message-----
From: Niall O Broin <niall at magicgoeshere.com>
To: Conor Daly <conor.daly at oceanfree.net>
Cc: ilug at linux.ie <ilug at linux.ie>
Date: 30 June 2000 18:31
Subject: Re: [ILUG] Remove duplicate lines from a file?


>so I offer
>
>perl -ne 'print unless ($seen{$_}++)'
>
>as a pipe to do the job. There's one slight hitch - this will consume memory
>like there's no tomorrow. If the file(s) you want to treat are somewhat
>smaller than your free virtual memory, you'll be OK.
>
>Regards,
>Niall


-----Original Message-----
From: Gordon McCormick <gordon-ilug at esatclear.ie>
To: Conor Daly <conor.daly at oceanfree.net>
Date: 30 June 2000 18:30
Subject: Re: [ILUG] Remove duplicate lines from a file?

>#!/usr/bin/perl
>
>while(<>) {
>  unless ($line{$_}) {
>    print $_;
>    $line{$_} = 1;
>  }
>}
>
>---
>
>It's ugly, but it should work, and no sorting required.
>
>gordon

I wish I knew some perl so I'd know how these work.  Ah, some day....
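
Roughly, both snippets do the same thing: a hash remembers every line
already printed, and a line is printed only the first time it turns up.
In Niall's one-liner, `$seen{$_}++` yields the count from *before* the
increment, which is zero (false) the first time a line is seen.  Below is
a minimal sketch of the same idea in Python rather than Perl; the name
`unique_lines` is made up for illustration and is not from the thread.

```python
def unique_lines(lines):
    """Yield each line the first time it appears, keeping input order."""
    seen = set()  # plays the role of %seen / %line in the Perl versions
    for line in lines:
        if line not in seen:
            seen.add(line)  # remember it so later copies are skipped
            yield line


# Small demonstration on an in-memory list instead of a real file.
demo = ["a\n", "b\n", "a\n", "c\n", "b\n"]
print("".join(unique_lines(demo)), end="")
```

To use it as a pipe, pass `sys.stdin` in and write the result with
`sys.stdout.writelines(...)`.  Like the Perl versions, it keeps every
distinct line in memory, so the same caveat about large inputs applies.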

I'm not *quite* being lazy in trying to avoid sorting.

The input files are generated by cat-ing about 700 files through grep to
extract certain lines.  These files overlap chronologically, but the result
should be in chronological order once the duplicates are removed.  The sort
keys are dates with 2-digit years, so 2000 ('00') comes out before 1999
('99') in the sorted version (I *could* sed the first '00' or '99' into
'2000' or '1999' for the sort to work properly but that might affect the
next guy in the processing chain since he dealt with the 1998 / 99 version
of the data last year and is all scripted for the data in its current form
(don't you just *love* long rambling parentheses... (I should probably be
bracing these like nested functions :-} ))).
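
If sorting did turn out to be necessary, one way round the 2-digit-year
problem is to expand the year only inside the sort key, so the lines
themselves are never changed and the next guy's scripts still see the
data in its current form.  The sketch below is Python and rests on
assumptions that are NOT from the real data: that each line starts with
a date written as YY-MM-DD, and that years 50-99 mean 19xx while 00-49
mean 20xx.  The names `expand_year`, `sort_key` and `unique_sorted` are
hypothetical.

```python
def expand_year(yy, pivot=50):
    """Map a 2-digit year to 4 digits (the pivot of 50 is an assumption)."""
    return 1900 + yy if yy >= pivot else 2000 + yy


def sort_key(line):
    # Hypothetical layout: the line begins with a date as YY-MM-DD.
    yy, mm, dd = (int(part) for part in line[:8].split("-"))
    return (expand_year(yy), mm, dd)


def unique_sorted(lines):
    """Drop duplicate lines, then sort by date; the line text is untouched."""
    # dict.fromkeys keeps first-seen order, and sorted() is stable, so
    # lines sharing a date stay in the order they first appeared.
    return sorted(dict.fromkeys(lines), key=sort_key)


sample = [
    "99-12-31 last reading of 1999\n",
    "00-01-01 first reading of 2000\n",
    "99-12-31 last reading of 1999\n",
]
print("".join(unique_sorted(sample)), end="")
```

The output still carries the original 2-digit years; only the comparison
used for ordering sees 1999 and 2000.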

Thanks guys, I'm probably OK for memory for Niall's hungry pipe (I'll have
the box to myself on night duty).

---
Conor Daly

Ph   +353 1 8326146

conor.daly at oceanfree.net
------------------------------------------




