LINUX.IE, website of the Irish Linux Users' Group

[ILUG] Filtering a file.

Brian Foster blf at utvinternet.ie
Thu Jul 11 15:11:26 IST 2002


  | Date: Thu, 11 Jul 2002 09:28:11 +0200
  | From: David Neary <dneary at wanadoo.fr>
  | 
  | David Neary [ previously ] wrote:
  | > Aherne Peter-pahern02 wrote:
  | > > What I want to do is get any line starting with /XXX/ CE and remove
  | > > that line and the following one.  [ ... ]
  | > 
  | > OK - this may not work, but the idea is sound enough.
  | > 
  | > sed '/^\/XXX\/ CE/{d;d}' filename
                            ↑
 close, but not quite.      │
 yer missing a `;' here, ───┘
 after the 2nd `d' before
 the closing `}', i.e.:

      sed '/^\/XXX\/ CE/{d;d;}' filename
 or:
      sed '\¬^/XXX/ CE¬{d;d;}' filename

 _however_, these commands are not actually correct!

 they are not correct because `d' starts the next cycle,
 i.e., the next line is read and the program starts from the
 beginning.  hence, the 2nd `d' is never executed, and thus
 the following line is printed.  but it should have been
 "removed" (not printed) ....   ;-(

 unfortunately, the obvious ed(1)-inspired fix doesn't work,
 at least with GNU `sed', which seems to lack the concept of
 address arithmetic:

      sed '\¬^/XXX/ CE¬,//+1d' filename    # DOES NOT WORK
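
 for comparison, later GNU `sed' releases document an `addr1,+N'
 address form, which is a GNU extension and not POSIX.  the sketch
 below assumes the local `sed' supports that extension; the sample
 input is made up, and `|' stands in for `¬' as the RE delimiter
 only to keep the example ASCII:

```shell
# hypothetical sample input: one matching line followed by one line to drop
input='keep A
/XXX/ CE one
drop A
keep B'

# addr1,+1 selects the matching line plus the one line after it
# (GNU extension, assumed to be available here)
result=$(printf '%s\n' "$input" | sed '\|^/XXX/ CE|,+1d')

printf '%s\n' "$result"
```

 where that extension is missing, the portable `n;d' approach is the
 one to use.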

 instead, this somewhat obscure command does the trick:

      sed -n '\¬^/XXX/ CE¬{n;d;};p' filename

 what this does:

  -n … … … … … … … … never print anything automagically.

  \¬^/XXX/ CE¬{  … … starting with lines matching the RE
                     `^/XXX/ CE' do the commands enclosed
                     in braces `{ ... }'.

  n; … … … … … … … … forget the current (matching) line and
                     read the next (following) line.  (if `-n'
                     was not specified, this would first print
                     the current line.)  hence, `sed' has read
                     both the matching line and the following
                     line, so now all we need to do is...

  d; … … … … … … … … forget (delete) the following line.  this
                     ends the program (for matching lines), so
                     `sed' reads the next line and starts again.

  }; … … … … … … … … end of brace `{ ... }'-enclosed commands.

  p  … … … … … … … … if we get this far, which could _only_
                     happen if neither the current nor the
                     previous line matched, print the line.

 the `sed' program now ends, so `sed' forgets what it just
 read and starts over again.
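
 a small sketch to check the walk-through above: it runs the
 `{d;d;}' attempt and the `-n ... {n;d;};p' command over the same
 made-up input.  the sample lines and variable names are
 hypothetical, and `|' stands in for `¬' as the RE delimiter only
 to keep the example ASCII:

```shell
# hypothetical sample input: two matching lines, each followed by a line to drop
input='keep A
/XXX/ CE one
drop A
keep B
/XXX/ CE two
drop B
keep C'

# {d;d;}: the first d ends the cycle, so only the matching lines are removed
naive=$(printf '%s\n' "$input" | sed '\|^/XXX/ CE|{d;d;}')

# -n with {n;d;};p: the matching line and the line after it are both removed
fixed=$(printf '%s\n' "$input" | sed -n '\|^/XXX/ CE|{n;d;};p')

printf 'naive:\n%s\n\nfixed:\n%s\n' "$naive" "$fixed"
```

 the `naive' result still contains the two `drop' lines; the `fixed'
 result contains only the three `keep' lines.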

  |[ ... ]
  | Sed commands should be on separate lines,

 IMHO, it's a matter of taste/style.  e.g., I'd normally write
 the above in a mix of styles, as:

      sed -n -e '\¬^/XXX/ CE¬{n;d;}' -e p -- filename

 although it can be argued whether or not that is any clearer.

  |                                           and the trailing }
  | needs to be on a line of it's own. Who knew!

 not quite.  `}' is a command and hence needs to be separated
 from the other commands, either by `;' or a newline.

 the obscure topic of when `;'s are used in `sed' commands is,
 AFAIK, incompletely discussed in (most?) sed(1) manual pages;
 and the GNU sed(1) man page does not mention `;'s at all!

 the rule is simple:  `;' can be used (almost) anyplace a
 command-separating newline can be used.  the exceptions are
 commands whose argument runs to the end of the line, such as
 `a', `i', `c', `r', and `w'.  IMHO, part of the confusion
 arises because `}' is itself a _command_, unlike C/C++/awk/etc.
 (but similar to Bourne-ish shells), and hence must itself be
 separated from the other commands.   (the other part of the
 confusion is that `{' is not a command per se, and hence does
 not need to be separated.)
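
 to illustrate the separator rule, the sketch below writes the same
 program twice, once with `;' and once with newlines (plus `#'
 comments), and compares the results.  the sample input and the
 variable names are hypothetical, and `|' is used as the RE
 delimiter to keep the example ASCII:

```shell
# hypothetical sample input
input='keep A
/XXX/ CE one
drop A
keep B'

# commands separated by ;  (note the ; before the closing brace)
one_line=$(printf '%s\n' "$input" | sed -n '\|^/XXX/ CE|{n;d;};p')

# the same program with commands separated by newlines
multi_line=$(printf '%s\n' "$input" | sed -n '
# the block runs only on matching lines
\|^/XXX/ CE|{
n
d
}
# every line that gets this far is printed
p
')

if [ "$one_line" = "$multi_line" ]; then
    echo same
else
    echo different
fi
```

 both spellings are the same program; which one reads better is the
 matter of taste mentioned earlier.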

  | This would also work, I think...
  | 
  | sed -n '/^\/XXX\/ CE/{
  | n
  | n
  | }
  | /^\/XXX\/ CE/! p' filename

 close but not quite.  it will fail on the input:

    /XXX/ CE one, 1st line
                  2nd line
    /XXX/ CE two, 3rd line
                  4th line

 printing the `... 4th line'.  the reason this fails is left as
 an exercise to the reader.
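
 for anyone who wants to check the claim before working out the
 reason, here is a sketch that runs the quoted variant and the
 `{n;d;};p' version over a simplified copy of the input above
 (indentation dropped; `|' used as the RE delimiter to keep the
 example ASCII; the variable names are hypothetical):

```shell
# simplified copy of the four-line input
input='/XXX/ CE one, 1st line
2nd line
/XXX/ CE two, 3rd line
4th line'

# quoted variant: the second n reads the 3rd line in the middle of a cycle,
# so the block never runs for it and the 4th line is handled as ordinary
broken=$(printf '%s\n' "$input" | sed -n '\|^/XXX/ CE|{
n
n
}
\|^/XXX/ CE|!p')

# n;d version: each matching line takes exactly one following line with it
fixed=$(printf '%s\n' "$input" | sed -n '\|^/XXX/ CE|{n;d;};p')

printf 'broken: [%s]\nfixed:  [%s]\n' "$broken" "$fixed"
```

 the `broken' result is the single line `4th line'; the `fixed'
 result is empty, since every line is either a match or the line
 after one.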

  | Sorry for the earlier misinformation.

 thanks for the corrections.  I hope my comments above are also
 useful and not too misleading.

cheers!
	-blf-

  |        David Neary,
  |     Marseille, France
  |   E-Mail: bolsh at gimp.org
--
 Innovative, very experienced, Unix and      | Brian Foster    Dublin, Ireland
 Chorus (embedded RTOS) kernel internals     | e-mail: blf at utvinternet.ie
 expert looking for a new position ...       | mobile: (+353 or 0)86 854 9268
  For a résumé, contact me, or see my website  http://www.blf.utvinternet.ie

    Stop E$$o (ExxonMobile):  «Whatever you do, don't buy Esso --- they
     don't give a damn about global warming.»    http://www.stopesso.com
     Supported by Greenpeace, Friends of the Earth, and numerous others...



