[ILUG] RE: [ILUG] Filtering a file.

From: Stephen_Reilly at domain dell.com
Date: Thu 11 Jul 2002 - 15:33:02 IST


sed "/^\/XXX\/ CE/,/$/d" filename

To explain: the first address, "/^\/XXX\/ CE/", matches the line the range
should start on. The second address, "/$/", matches the end of any line, and
because sed only starts looking for the end of a range on the line after the
start, the range always ends on the following line. The sed command "d" then
deletes both lines of the range. A nice, tidy, simple solution. This avoids
the need for two separate deletes for one deletion, i.e. find the pattern,
then delete, and then delete.
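
A quick way to check the range version, as a minimal sketch: the four-line
sample input is made up for illustration (only the "/XXX/ CE" prefix comes
from this thread), and it is fed on stdin instead of from filename.

```shell
# Hypothetical sample input: one matching line followed by a line to drop.
input='keep 1
/XXX/ CE header
drop me
keep 2'

# The range starts on the matching line and ends on the very next line,
# because /$/ matches any line and is first tested on the line after the start.
out=$(printf '%s\n' "$input" | sed '/^\/XXX\/ CE/,/$/d')

printf '%s\n' "$out"
# prints:
# keep 1
# keep 2
```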

I like this as well though:
> sed -n '\¬^/XXX/ CE¬{n;d;};p' filename
looks funny :)

steve

> sed '/^\/XXX\/ CE/{d;d;}' filename
> or:
> sed '\¬^/XXX/ CE¬{d;d;}' filename
>
> _however_, these commands are not actually correct!
>
> they are not correct because `d' starts the next cycle,
> i.e., the next line is read and the program starts from the
> beginning. hence, the 2nd `d' is never executed, and thus
> the following line is printed. but it should have been
> "removed" (not printed) .... ;-(
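
The failure described here is easy to reproduce. A small sketch, using a
made-up four-line input (not from the original question) and the
escaped-slash form of the command:

```shell
# Hypothetical sample input: one matching line followed by a line to drop.
input='keep 1
/XXX/ CE header
drop me
keep 2'

# The first `d' deletes the matching line and starts a new cycle at once,
# so the second `d' is never reached.
out=$(printf '%s\n' "$input" | sed '/^\/XXX\/ CE/{d;d;}')

printf '%s\n' "$out"
# prints:
# keep 1
# drop me
# keep 2
```

Only the matching line is gone; "drop me" survives.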
>
> unfortunately, the obvious ed(1)-inspired fix doesn't work,
> at least with GNU `sed', which seems to lack the concept of
> address arithmetic:
>
> sed '\¬^/XXX/ CE¬,//+1d' filename # DOES NOT WORK
>
> instead, this somewhat obscure command does the trick:
>
> sed -n '\¬^/XXX/ CE¬{n;d;};p' filename
>
> what this does:
>
> -n              never print anything automagically.
>
> \¬^/XXX/ CE¬{   starting with lines matching the RE
>                 `^/XXX/ CE' do the commands enclosed
>                 in braces `{ ... }'.
>
> n;              forget the current (matching) line and
>                 read the next (following) line. (if `-n'
>                 was not specified, this would first print
>                 the current line.) hence, `sed' has read
>                 both the matching line and the following
>                 line, so now all we need to do is...
>
> d;              forget (delete) the following line. this
>                 ends the program (for matching lines), so
>                 `sed' reads the next line and starts again.
>
> };              end of brace `{ ... }'-enclosed commands.
>
> p               if we get this far, which could _only_
>                 happen if neither the current nor the
>                 previous line matched, print the line.
>
> the `sed' program now ends, so `sed' forgets what it just
> read and starts over again.
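
The walkthrough can be checked on a small case. This is only a sketch: the
sample input is invented for illustration, and the second command drops
`-n' and the trailing `p' purely to show the remark about `n' printing the
current line first.

```shell
# Hypothetical sample input: one matching line followed by a line to drop.
input='keep 1
/XXX/ CE header
drop me
keep 2'

# With -n: `n' silently swaps in the following line, `d' discards it,
# and only lines that never enter the braces reach `p'.
with_n=$(printf '%s\n' "$input" | sed -n '/^\/XXX\/ CE/{n;d;};p')

# Without -n: `n' prints the matching line before reading the next one,
# so the matching line comes back in the output.
without_n=$(printf '%s\n' "$input" | sed '/^\/XXX\/ CE/{n;d;}')

printf '%s\n' "$with_n"
# prints:
# keep 1
# keep 2

printf '%s\n' "$without_n"
# prints:
# keep 1
# /XXX/ CE header
# keep 2
```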
>
> |[ ... ]
> | Sed commands should be on separate lines,
>
> IMHO, it's a matter of taste/style. e.g., I'd normally write
> the above in mix of styles, as:
>
> sed -n -e '\¬^/XXX/ CE¬{n;d;}' -e p -- filename
>
> albeit it can be argued whether or not that is any clearer.
>
> | and the trailing }
> | needs to be on a line of its own. Who knew!
>
> not quite. `}' is a command and hence needs to be separated
> from the other commands, either by `;' or a newline.
>
> the obscure topic of when `;'s are used in `sed' commands is,
> AFAIK, incompletely discussed in (most?) sed(1) manual pages;
> and the GNU sed(1) man page does not mention `;'s at all!
>
> the rule is simple: `;' can be used anyplace(?) a command-
> separating newline can be used. IMHO, part of the confusion
> arises because `}' is itself a _command_, unlike C/C++/awk/&tc
> (but similar to Bourne-ish shells), and hence must itself be
> separated from the other commands. (the other part to the
> confusion is `{' is not a command per se, and hence does not
> need to be separated.)
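
To illustrate the `;' versus newline rule, here is a sketch that runs the
same program written three ways (one line with `;', one command per line,
and split across two `-e' options) on a made-up sample input. The variable
names are just for the demonstration.

```shell
# Hypothetical sample input: one matching line followed by a line to drop.
input='keep 1
/XXX/ CE header
drop me
keep 2'

# Commands separated by `;', including the `;' after the closing `}'.
semicolons=$(printf '%s\n' "$input" | sed -n '/^\/XXX\/ CE/{n;d;};p')

# The same commands separated by newlines; `}' sits on its own line.
newlines=$(printf '%s\n' "$input" | sed -n '/^\/XXX\/ CE/{
n
d
}
p')

# The mixed style: each -e chunk is treated as a separate script line.
mixed=$(printf '%s\n' "$input" | sed -n -e '/^\/XXX\/ CE/{n;d;}' -e p)

# All three should agree.
[ "$semicolons" = "$newlines" ] && [ "$newlines" = "$mixed" ] && echo same
# prints: same
```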
>
> | This would also work, I think...
> |
> | sed -n '/^\/XXX\/ CE/{
> | n
> | n
> | }
> | /^\/XXX\/ CE/! p' filename
>
> close but not quite. it will fail on the input:
>
> /XXX/ CE one, 1st line
> 2nd line
> /XXX/ CE two, 3rd line
> 4th line
>
> printing the `... 4th line'. the reason this fails is left as
> an exercise to the reader.
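
For anyone who wants to watch it fail before working out why, a sketch
that feeds the four lines above to both programs. The variable names are
made up, `! p' is written as `!p', and the `n;d' command uses escaped
slashes instead of the `¬' delimiter; otherwise the commands are the ones
quoted in this thread.

```shell
# The four-line input from the message above.
input='/XXX/ CE one, 1st line
2nd line
/XXX/ CE two, 3rd line
4th line'

# The n;n version with the negated print.
two_n=$(printf '%s\n' "$input" | sed -n '/^\/XXX\/ CE/{
n
n
}
/^\/XXX\/ CE/!p')

# The n;d version, for comparison.
n_then_d=$(printf '%s\n' "$input" | sed -n '/^\/XXX\/ CE/{n;d;};p')

printf 'n;n gives: [%s]\n' "$two_n"
# prints: n;n gives: [4th line]

printf 'n;d gives: [%s]\n' "$n_then_d"
# prints: n;d gives: []
```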
>
> | Sorry for the earlier misinformation.
>
> thanks for the corrections. I hope my comments above are also
> useful and not too misleading.
>
> cheers!
> -blf-
>
> | David Neary,
> | Marseille, France
> | E-Mail: bolsh at domain gimp.org
> --
> Innovative, very experienced, Unix and    | Brian Foster    Dublin, Ireland
> Chorus (embedded RTOS) kernel internals   | e-mail: blf at domain utvinternet.ie
> expert looking for a new position ...     | mobile: (+353 or 0)86 854 9268
> For a résumé, contact me, or see my website  http://www.blf.utvinternet.ie
>